Cellular immunity vaccines from bacterial toxin-antigen conjugates.

Recombinant hybrid proteins having two primary components are disclosed. The first component is a modified bacterial
toxin that has translocating ability, while the second component is a polypeptide or protein that is exogenous to
an antigen-presenting cell. The hybrid can be internalized by an antigen-presenting cell, where the
hybrid is subsequently processed and an antigenic segment of the hybrid is presented on the surface of the
antigen-presenting cell, where the segment elicits an immune response by cytotoxic T lymphocytes.

BACKGROUND OF THE INVENTION

The numerous substances and organisms that threaten the existence of animals having immune
systems are either present in extracellular body fluids, as are toxins and bacteria, or else harbored
within the animal's own cells, as are viruses, certain parasites and oncogene products. This distinction is
important to thymus-derived lymphocytes, also known as T cells, which are an important component of
vertebrate immune systems. T cells have evolved parallel systems for recognizing intracellular and
extracellular antigens. In both systems, antigens are recognized only when they are bound to molecules of
the major histocompatibility complex (MHC).

The MHC encodes two types of cell surface molecules that act as receptors for protein antigens. Class I
MHC molecules consist of a highly polymorphic integral membrane glycoprotein alpha chain that is
noncovalently bound to beta-2 microglobulin. Class II MHC molecules consist of two noncovalently bound,
highly polymorphic, integral membrane glycoproteins. Class I MHC molecules have a groove at the top
surface formed by the two amino-terminal domains. The groove holds an antigen. As with other cell surface
proteins, during cellular processing in the cytosol, MHC molecules are inserted into the endoplasmic
reticulum (ER) and, following chain assembly, are transported to the plasma membrane of the cell via the
Golgi complex and post-Golgi complex vesicles.

The recognition of Class I vs. Class II molecules as antigen-presenting sites in general divides T cells
into two classes, respectively termed cytotoxic T cells (Tc) and helper T cells (Th). Tc cells directly lyse
cells that are infected with viruses or certain parasites and also will secrete cytokines such as
gamma-interferon in order to eradicate intracellular pathogens and tumors.

Virtually all cell types can serve as antigen-presenting cells for Tc cells as long as they express MHC
Class I molecules. In general, Tc cells require antigen-presenting cells that are actively biosynthesizing
antigen. During processing, the antigen is bound to a nascent Class I molecule in the ER and transported to
the plasma membrane via the Golgi complex and post-Golgi complex vesicles. At the plasma membrane,
the processed antigen sits in the groove of the MHC Class I molecule, where the processed antigen is
available for binding to cell surface receptors of Tc cells. Activation of Tc cells requires interaction between
multiple Tc cell surface molecules and their respective ligands on antigen-presenting cells. Once activation
has taken place, the lysing and cytokine secretion activity described above can begin.

Antigen processing is the structural modification and trafficking, within the proper subcellular
compartments, of protein antigens that enable the determinants recognized by Tc cells to interact with MHC
molecules. As noted above, most, and possibly all, somatic cells expressing MHC Class I molecules
constitutively process antigens and transport determinants to the cell surface for Tc cell recognition. Antigen
processing is thus required for the presentation of intact, folded proteins to Tc cells. Commonly, antigen
processing entails the generation of short peptides by cellular proteases, although some intact proteins
productively associate with MHC molecules, indicating that proteolysis is not necessarily a component of
antigen processing.

Two distinct pathways are used by cells to process antigens. The endosomal pathway is so named
because it is accessed through the endosomal compartment. Determinants produced by this pathway
usually associate with Class II MHC molecules. The other pathway is the cytosolic pathway. The cytosolic
pathway is so named because it can be accessed from the cytosol of the cell by the synthesis of proteins
within the cell, or by penetration of plasma or endosomal membranes by extracellular proteins. Such
penetration may occur naturally through the fusion of the cell's membrane with a virus, or artificially by
osmotic lysis of antigen-containing pinosomes. Determinants produced by cytosolic processing typically
associate with Class I MHC molecules. The cytosolic pathway is able to process many different types of
foreign proteins for presentation to Tc cells.

Class I MHC molecules associate with antigens in a compartment of the ER. In this regard, it is
important to note that the compound Brefeldin A acts by interfering with the normal vesicular traffic between
the ER and the Golgi apparatus, and thus also has the effect of blocking the presentation of cytosolically
processed antigen on the surface of what would otherwise be an antigen-presenting cell.

It can be seen from the above discussion that, in order to generate a response by a cytotoxic T cell, it is
generally necessary either to cause the target cell, which has been chosen as an antigen-presenting cell, to
endogenously synthesize the protein antigen of interest, or to deliver exogenous protein antigen of interest
directly into the cytosolic antigen processing pathway of the target cell. If the latter could be accomplished,
a vaccine could be produced which would elicit cytotoxic T cells capable of killing virally or parasitically
infected cells or tumor cells, thereby having particular usefulness for preventing three clinical types of
diseases.

First, such vaccines could prevent infections caused by viruses such as papilloma or herpes virus which
do not undergo a blood-borne phase of infection. This would be especially true in the case of human
papilloma virus E7 protein, which is continuously expressed in cells of the transformed phenotype, and
would thus be particularly well suited to attack by sensitized cytotoxic T lymphocytes.

Secondly, there are those infections caused by viruses such as influenza or human immunodeficiency
virus (HIV), or by parasites, whose outer proteins may have high antigenic variability, making it difficult to
design a vaccine capable of eliciting protective titers of high affinity antibodies with broad specificity. Certain
viral internal proteins have less antigenic variation, and peptides derived from such proteins, when associated
with Class I MHC molecules, would render infected cells susceptible to lysis by sensitized cytotoxic T
lymphocytes.

Thirdly, tumors and virally transformed cells express neoantigens that may be presented on Class I 
MHC molecules, thus rendering these cells suitable targets for cytotoxic T lymphocyte lysis. 

Current vaccines generally focus on generating humoral (that is, antibody) responses of the immune
system, rather than the cellular immune responses discussed above. Those that do generate cellular
immune responses use attenuated live viruses which replicate intracellularly. Because the viral constituents
are synthesized within the infected cell, they are available to that cell's antigen processing pathway.
Thus, there is a need for a non-replicating
vaccine that will sensitize cytotoxic T lymphocytes to produce a cellular immune response with a
significantly greater margin of safety.

The present invention meets this need by capitalizing on the ability of certain bacterial exotoxins to be
internalized into cells through endocytosis via receptors on the cell surface and then translocate out of the
resultant endosomes into the cellular compartment in which endogenous proteins are processed for
presentation. These exotoxins have been hybridized with polypeptide or protein antigens, which are carried
into the cytoplasm and are processed to peptides capable of association with Class I MHC molecules via
the physiologic processes discussed above. Once associated with a Class I MHC molecule and presented
on the surface of the antigen-presenting cell, they can sensitize cytotoxic T lymphocytes against other
infected cells synthesizing the same polypeptide or protein. By virtue of these actions, the invention
presents vaccines which can be effective in prophylaxis against viruses, parasites and malignancies.

It is an additional object of the present invention to produce hybrid proteins of certain bacterial
exotoxins having translocation domains, hybridized with polypeptides or proteins selected for their antigenic
activity, which hybrids will be useful as probes for studying the intracellular processing and subsequent
presentation of endogenously synthesized cytoplasmic proteins.

BRIEF DESCRIPTION OF THE DRAWINGS 


Figure 1 shows the structural domains of Pseudomonas exotoxin, along with the numbers of the amino
acid residues that define the known limits of the structural domains. Amino acid residues are numbered as
defined in Gray, et al., PNAS USA 81:2645-2649 (1984).

Figure 2 is a restriction map for plasmid pVC45-DF + T.

Figure 3 is a restriction map for plasmid pBluescript II SK.

Figure 4 is a restriction map for plasmid pBR322.

Figure 5 is a graph showing the results of using hybrid construct PEMa in immunologically sensitizing
U-2 OS cells, a human cell line.

Figure 6 shows that a hybrid protein made of the binding and translocating domains of Pseudomonas
exotoxin and a peptide epitope of influenza A matrix protein can competitively prevent the intact
Pseudomonas exotoxin from binding to and killing target cells.



SUMMARY OF THE INVENTION 

The invention is a hybrid protein of two species, the first species being a modified bacterial toxin that
has a translocating domain. The second species is a polypeptide or protein. The polypeptide or protein is
exogenous to an antigen-presenting cell of interest. The hybrid of the bacterial toxin and the exogenous
polypeptide or protein is constructed in such a way as to be capable of eliciting an immune response by
cytotoxic T lymphocytes.

A preferred bacterial toxin is a modified Pseudomonas exotoxin. Pseudomonas exotoxin is known to
consist of four structural domains, namely Ia, II, Ib and III. This is shown in Figure 1, along with the numbers
of the amino acid residues that define the known limits of the structural domains. More preferably, the
Pseudomonas exotoxin is modified by deletion of structural domain III, that is, the ADP-ribosylating structural
domain, although alternatively domain III need not be entirely deleted, but may rather be sufficiently altered
in its amino acid sequence so as to render it enzymatically nonfunctional as an ADP-ribosylating enzyme.
Most preferably, the modified bacterial toxin has only a cellular recognition domain and a translocating
domain (with or without the 5 C-terminal amino acids of domain III added to the C-terminus of the
polypeptide or protein antigen), or even just the translocating domain with or without a targeting ligand. In the
case of Pseudomonas exotoxin, the cellular recognition domain and translocating domain are known to exist
within structural domains Ia, II and Ib. Also most preferably, modified Pseudomonas exotoxins are arranged
on the amino-terminal side of the hybrid, while the exogenous polypeptide or protein is arranged on the
carboxyl-terminal side of the hybrid.

The exogenous polypeptide or protein, which is exogenous to an antigen-presenting cell of interest, is
preferably a polypeptide or protein of viral origin. More preferably, the viral polypeptide is a viral protein
fragment, and most preferably is taken from the group comprising the matrix protein of influenza A virus;
residues 57 to 68 of the matrix protein of influenza A virus (the matrix epitope known to bind MHC HLA-A2);
the nucleoprotein of influenza A virus; or the GAG protein of human immunodeficiency virus-1.

Functionally, the hybrid is capable of eliciting an immune response by cytotoxic T lymphocytes, by
virtue of being at least partially presented on an antigen-presenting cell surface. More specifically, the
hybrid functionally is capable of being internalized by an antigen-presenting cell and further capable of
being processed, via the endogenous protein processing pathway, on its way to at least partial presentation
on the surface of the antigen-presenting cell.

The hybrid proteins preferably will use polypeptide or protein antigens for use as a vaccine, and most
preferably will use viral antigens. Most preferably, these viral antigens will be conserved viral proteins. The
hybrids will be incorporated, in an amount sufficient to elicit an immune response by cytotoxic T
lymphocytes, into vaccines further comprising pharmaceutically acceptable carriers. The vaccines will be
sufficient to immunize a host against influenza, acquired immunodeficiency syndrome, and diseases caused by
human papilloma virus, cytomegalovirus, Epstein-Barr virus, rotavirus and respiratory syncytial virus, as well as
against tumors and parasites.

The present invention further relates to recombinant DNA segments containing nucleotide sequences
coding for the fused proteins described above, as well as plasmids and transformants harboring such
recombinant DNA segments, as well as methods of producing the hybrid proteins using such recombinant
DNA segments and methods of administration of the hybrid proteins as vaccines to hosts.

DETAILED DESCRIPTION OF THE INVENTION 

The term "translocating domain" shall mean a sequence of amino acid residues sufficient to confer on
a polypeptide or protein the ability to translocate across a cell membrane into a cellular compartment for
processing endogenous proteins.

The term "exogenous to an antigen-presenting cell" shall mean polypeptides that are not encoded by
the unmutated genome of a given antigen-presenting cell.

The term "antigen-presenting cell" shall refer to a variety of cell types which carry antigen in a form
that can stimulate cytotoxic T lymphocytes to an immunologic response.

The term "immune response" shall mean those cytotoxic processes of cell lysis and cytokine release
engaged in by cytotoxic T lymphocytes that have been stimulated by antigen presented by an
antigen-presenting cell. This term shall also include the ability of a host's cytotoxic T lymphocytes to retain their
cytotoxic response to subsequent exposure to the same antigen that will lead to more rapid elimination of
the antigen than in a non-immune state.

The term "presented on an antigen-presenting cell surface" shall mean that process by which an
antigen is seated within a ligand site of a major histocompatibility complex Class I protein on the surface of
an antigen-presenting cell.

The term "being internalized by an antigen-presenting cell" shall mean the process of endocytosis
resulting in endosome formation.

The term "cellular recognition domain" shall mean a sequence of amino acid residues in a polypeptide 
sufficient to confer on that polypeptide the ability to recognize a receptor site on the surface of a target cell. 

The term "ADP ribosylating domain" shall mean a sequence of amino acids sufficient to confer on a
polypeptide the ability to modify elongation factor 2 within a cell, and thereby severely impair the viability of
the cell or kill it.

The term "vaccine" shall mean a pharmaceutically acceptable suspension of a given therapeutic entity
administered for the prevention, amelioration or treatment of infectious diseases.

The term "conserved viral protein" shall mean those viral proteins that do not vary from strain to strain
of a given species of virus, or those viral proteins that are generally unlikely to undergo mutation as a
function of time in a given strain.

The term "arranged on the amino terminal side of said hybrid" shall mean that a peptide sequence has
been inserted at any point between the amino terminus of a hybrid and the hybrid's middle amino acid
residue.

The term "arranged on the carboxy terminal side of said hybrid" shall mean that a peptide sequence 
has been inserted at any point between the carboxy terminus of a hybrid and the hybrid's middle amino 
acid residue. 
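
The two positional definitions above can be illustrated with a small check. The following is only a minimal sketch: it assumes 1-based residue numbering and takes the "middle amino acid residue" of a hybrid of length n to be residue ceil(n / 2). The function name and that convention for even-length hybrids are illustrative assumptions and are not part of the specification.

```python
import math


def terminal_side(insert_position: int, hybrid_length: int) -> str:
    """Classify a 1-based insertion position within a hybrid protein.

    Assumption (not from the specification): the "middle" residue of a
    hybrid of length n is residue ceil(n / 2), and an insertion exactly
    at that residue is reported as "middle".
    """
    if not 1 <= insert_position <= hybrid_length:
        raise ValueError("insert_position must lie within the hybrid")
    middle = math.ceil(hybrid_length / 2)
    if insert_position < middle:
        return "amino-terminal side"
    if insert_position > middle:
        return "carboxy-terminal side"
    return "middle"
```

For a hypothetical hybrid of 664 residues, position 10 would be classified as lying on the amino-terminal side and position 500 as lying on the carboxy-terminal side.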

The hybrid proteins of the present invention are fusion protein constructs of a bacterial toxin having a
translocating domain fused to a polypeptide or protein that has been selected for its antigenicity for a given
disease, as well as for being exogenous to a targeted antigen-presenting cell. A preferred bacterial toxin is
the Pseudomonas exotoxin. This exotoxin is known to comprise four structural domains, as shown in Figure
1. These domains are designated Ia, II, Ib and III. Structural domain Ia is known to be necessary for binding
of the exotoxin to a receptor site on the surface of a target cell. Structural domain II is known to be
necessary for translocation of the exotoxin across an internal membrane of the targeted cell. Part of structural
domain III is known to be an ADP ribosylating enzyme that binds to the protein Elongation Factor 2, which generally
results in the death of the target cell.

In a preferred embodiment of the present invention, structural domain III (or all of domain III except for the
C-terminal amino acids) has been deleted from the Pseudomonas exotoxin molecule, and has been
replaced with one of several polypeptides or proteins chosen for their ability to act as antigens and
therefore be useful as vaccines. The antigens used for vaccines include antigens of viruses whose hosts are
higher vertebrates, such as antigens of influenza A virus, human immunodeficiency virus-1, human papilloma
virus, cytomegalovirus, Epstein-Barr virus, rotavirus, and respiratory syncytial virus. Other viruses include
herpes viruses (such as herpes simplex virus and varicella-zoster virus), adult T cell leukemia virus, hepatitis B
virus, hepatitis A virus, parvoviruses, papovaviruses, adenoviruses, pox viruses, reoviruses,
paramyxoviruses, rhabdoviruses, arenaviruses, and coronaviruses. Other disease states can have antigens designed
for them and used in alternative embodiments of the present invention, including antigens of pathogenic
protozoa, such as malaria antigen.

The fusion proteins of the present invention are preferably manufactured through expression of
recombinant DNA sequences.

The DNAs used in the practice of the invention may be natural or synthetic. The recombinant DNA
segments containing the nucleotide sequences coding for the embodiments of the present invention can be
prepared by the following general processes:

(a) A desired truncated gene is cut out from a plasmid in which it has been cloned, or the gene can be
chemically synthesized;

(b) An appropriate linker is added thereto as needed, followed by construction of a fused gene; and

(c) The resulting fused protein gene is ligated downstream from a suitable promoter in an expression
vector.

Techniques for cleaving and ligating DNA as used in the invention are generally well known to those of
ordinary skill in the art and are described in Molecular Cloning, A Laboratory Manual, (1989) Sambrook, J.,
et al., Cold Spring Harbor Laboratory Press.

As the promoter used in the present invention, any promoter is usable as long as the promoter is
suitable for expression in the host used for the gene expression. The promoters can be prepared
enzymatically from the corresponding genes, or can be chemically synthesized.

Conditions for usage of all restriction enzymes were in accordance with those of the manufacturer, 
including instructions as to buffers and temperatures. The enzymes were obtained from New England 
Biolabs, Bethesda Research Laboratories (BRL), Boehringer Mannheim and Promega. 

Ligations of vector and insert DNAs were performed with T4 DNA ligase in 66 mM Tris-HCl, 5 mM
MgCl2, 1 mM DTE, 1 mM ATP, pH 7.5 at 15°C for up to 24 hours. In general, 1 to 200 ng of vector and a 3-5x
excess of insert DNA were preferred.
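
As an illustration of how the amount of insert for such a ligation might be calculated, the following minimal sketch assumes that the stated 3-5x excess is a molar excess of insert over vector. The function name and the fragment sizes in the usage lines are hypothetical and are not taken from the specification.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_excess: float = 3.0) -> float:
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    For double-stranded DNA, the number of moles is proportional to
    mass / length, so:
        insert_ng = molar_excess * vector_ng * insert_bp / vector_bp
    """
    if min(vector_ng, vector_bp, insert_bp, molar_excess) <= 0:
        raise ValueError("all arguments must be positive")
    return molar_excess * vector_ng * insert_bp / vector_bp


# Hypothetical sizes: 100 ng of a 3,000 bp vector and a 1,300 bp insert.
low = insert_mass_ng(100, 3000, 1300, molar_excess=3)
high = insert_mass_ng(100, 3000, 1300, molar_excess=5)
print(f"{low:.0f}-{high:.0f} ng insert")
```

Under these assumed sizes, the 3x and 5x ratios correspond to roughly 130 ng and 217 ng of insert, respectively.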

Selection of E. coli containing recombinant plasmids involves streaking the bacteria onto appropriate
antibiotic-containing LB agar plates or culturing in shaker flasks in LB liquid (Tryptone 10 g/L, yeast extract
5 g/L, NaCl 10 g/L, pH 7.4) containing the appropriate antibiotic for selection when required. Choice of
antibiotic for selection is determined by the resistance markers present on a given plasmid or vector.
Preferably, vectors are selected by ampicillin.
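
The LB composition given above can be scaled to an arbitrary batch volume. The sketch below is illustrative only; the dictionary and function names are assumptions, and pH adjustment and antibiotic addition are not modelled.

```python
# Grams per litre, as stated for LB liquid in the text above.
LB_GRAMS_PER_LITRE = {"tryptone": 10.0, "yeast extract": 5.0, "NaCl": 10.0}


def lb_recipe(volume_ml: float) -> dict[str, float]:
    """Grams of each LB component for a final volume given in millilitres."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {
        name: grams_per_litre * volume_ml / 1000.0
        for name, grams_per_litre in LB_GRAMS_PER_LITRE.items()
    }
```

For example, `lb_recipe(250)` gives the component masses for a 250 ml flask culture.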

Culturing of E. coli involves growing in Erlenmeyer flasks in LB supplemented with the appropriate
antibiotic for selection in an incubation shaker at 250-300 rpm and 37°C. Other temperatures from 25° to
37°C could be utilized. When cells are grown for protein production, they are induced at Asbo - 1 with IPTG
to a final concentration of 0.4 mM. Other cell densities in log phase growth can alternatively be chosen for
induction.
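
The volume of IPTG stock needed to reach the stated final concentration of 0.4 mM follows from the dilution relation C1 x V1 = C2 x V2. The following minimal sketch assumes that the added stock volume is negligible compared with the culture volume; the function name and the 1 M stock concentration in the usage note are hypothetical.

```python
def iptg_stock_volume_ul(culture_ml: float, stock_mM: float,
                         final_mM: float = 0.4) -> float:
    """Microlitres of IPTG stock to add to a culture (C1*V1 = C2*V2).

    Assumes the added volume is small enough that the culture volume is
    effectively unchanged.
    """
    if culture_ml <= 0 or stock_mM <= 0 or final_mM <= 0:
        raise ValueError("all arguments must be positive")
    if final_mM >= stock_mM:
        raise ValueError("stock must be more concentrated than the target")
    volume_ml = final_mM * culture_ml / stock_mM
    return volume_ml * 1000.0
```

With an assumed 1 M (1000 mM) stock, `iptg_stock_volume_ul(500, 1000)` evaluates to about 200 microlitres for a 500 ml culture.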

Harvesting involves recovery of E. coli cells by centrifugation. For protein production, cells are
harvested 3 hours after induction, though other times of harvesting could be chosen.

In the present invention, any vector, such as a plasmid, may be used as long as it can be replicated in 
a procaryotic or eucaryotic cell as a host. 

By using the vector containing the recombinant DNA thus constructed, the host cell is transformed via
the introduction of the vector DNA.

The host cell of choice is BL21(DE3) cells (E. coli), obtained from F. W. Studier, Brookhaven National
Laboratories, Stony Brook, N.Y. Reference is also made to Wood, J. Mol. Biol., 16:118-133 (1966); U.S.
Patent No. 4,952,496; and Studier, et al., J. Mol. Biol. 189:113-130 (1986). However, any strain of E. coli
containing an IPTG inducible T7 polymerase gene would be suitable. For routine cloning, E. coli strain
DH5α (BRL) can be used.

The BL21(DE3) strain of E. coli was acquired under license from F. W. Studier. Reference is made to
Studier, F. W. et al., Methods in Enzymology, Vol. 185, Ch. 6, pp 60-89 (1990). This strain is unique to the
extent that it contains an inducible T7 polymerase gene. The strain has no amino acid, sugar or vitamin
markers, so it can grow on any rich or defined bacterial medium. It can be grown between 25°C and 37°C.
It needs aeration, and it needs IPTG for induction of the T7 polymerase.

In the present invention, the fused proteins can be separated and purified by appropriate combinations
of well-known separating and purifying methods. These methods include methods utilizing a solubility
differential such as salt precipitation and solvent precipitation, methods mainly utilizing a difference in
molecular weight such as dialysis, ultrafiltration, gel filtration and SDS-polyacrylamide gel electrophoresis,
methods utilizing a difference in electric charge such as ion-exchange column chromatography, methods
utilizing specific affinity such as affinity chromatography, methods utilizing a difference in hydrophobicity
such as reverse-phase high pressure liquid chromatography, methods utilizing a difference in isoelectric
point, such as isoelectric focusing electrophoresis, and methods using denaturation and reduction and
renaturation and oxidation.

Preferred embodiments of the invention will now be described in detail in the following non-limiting
examples. The most preferred embodiments of the invention are any or all of those specifically set forth in
these examples. These examples are not, however, to be construed as forming the only genus that is
considered as the invention, and any combination or sub-combination of the examples may themselves
form a genus. These examples further illustrate details for the preparation of various embodiments of the
present invention. Those skilled in the art will readily understand that known variations of the conditions and
processes of the following preparative procedures can be used to prepare these embodiments.

EXAMPLE 1 
BS-PEM1-2 


A 1.3 kb NruI/SacII fragment of plasmid pVC45-DF + T (Fig. 2) (obtained from Dr. Ira Pastan of the
National Institutes of Health) containing the domain I and II coding regions of Pseudomonas exotoxin (PE)
(Sequence ID No. 1) was subcloned into pBluescript II SK (Stratagene, Fig. 3) restricted with HincII and
SacII. The resulting construct is designated BS-PE. The influenza M1 (M1) gene (Sequence ID No. 2 and 3),
which codes for the matrix protein of influenza A virus, was subcloned into BS-PE restricted with SacII and
SacI by amplifying the M1 gene from pApr701 by
polymerase chain reaction (PCR) (GeneAmp® PCR Reagent Kit; Perkin Elmer Cetus, Norwalk, Conn.
06859) with oligonucleotide primers which added a SacII site adjacent to M1 codon number 2 (Sequence ID
No. 4) and a SacI site 3' of the M1 termination codon (Sequence ID No. 5). pApr701 (P. Palese, Mt. Sinai
Medical Center, New York, N.Y.) consists of the M1 gene cloned into the EcoRI site of pBR322, shown at
Fig. 4; reference is made to Young, J.F. et al., Expression of Influenza Virus Genes; The Origin of Pandemic
Influenza Virus; 1983. The resulting plasmid is designated BS-PEM1-1.

The truncated ompA leader coding sequence was removed from the 5' end of the fusion gene by
replacing the small XhoI/HindIII fragment of BS-PEM1-1 with the oligonucleotide sequence shown in
Sequence ID No. 6. The resulting plasmid is named BS-PEM1-2 and encodes a fusion consisting of
Pseudomonas exotoxin amino acids 2 through 414 joined to M1 amino acids 2 to 252 (Sequence ID No. 7
and 8).

EXAMPLE 2
pVC-ompA-PEM1-2 

pVC45DF + T vector was prepared by restriction digestion with HindIII and EcoRI, followed by gel
purification.

The PEM1 insert fragment was prepared by restriction digestion of BS-PEM1-1 with SacI, followed by
T4 DNA polymerase treatment to remove the 3' overhang. EcoRI linkers were added to the blunted SacI
site, followed by restriction digestion with HindIII. The HindIII-EcoRI fragment was gel purified (Molecular
Cloning Manual, Gene Clean Kit, Bio 101, Inc. P.O. Box 2284, La Jolla, CA 92038) and ligated into the
prepared pVC45-DF + T vector. The resulting construct was named pVC-ompA-PEM1-2.

The ompA signal sequence was removed from the construct by restriction digestion of pVC-ompA-PEM1-2
with XbaI and HindIII. An oligonucleotide fragment containing the T7 promoter, ribosome binding
site and initiation sequence, whose base sequence is shown at Sequence ID No. 9, was ligated into the vector.
The resulting plasmid construct was named pVC-PEM1-2 and encodes a T7 polymerase-driven gene
fusion consisting of PE amino acids 2 through 414 joined to influenza M1 amino acids 2 through 252. The 5'
and 3' ends of the coding region, as well as the PE to M1 fusion site and cytotoxic T lymphocyte epitope
coding sequences (Rotzschke, O. et al., Nature 348, 252 (1990)), were confirmed by DNA sequencing.
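
As a rough check on the size of such a fusion, the residue ranges stated above can be summed. The following minimal sketch counts only the residues named in the text (PE amino acids 2 through 414 and M1 amino acids 2 through 252); it deliberately ignores any initiator methionine or linker-encoded residues, which are not enumerated here. The helper names are illustrative assumptions.

```python
def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive, 1-based residue range."""
    if last < first:
        raise ValueError("last must not be smaller than first")
    return last - first + 1


def fusion_length(*segments: tuple[int, int]) -> int:
    """Total residues contributed by the given inclusive ranges."""
    return sum(span_length(first, last) for first, last in segments)


PE_SEGMENT = (2, 414)   # PE amino acids 2 through 414
M1_SEGMENT = (2, 252)   # influenza M1 amino acids 2 through 252

pem1_length = fusion_length(PE_SEGMENT, M1_SEGMENT)
print(pem1_length)  # 413 + 251 = 664
```

The same helper can be applied to any other pair of residue ranges by substituting the corresponding tuples.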

EXAMPLE 3
BS-PEMa

The influenza Ma sequence (coding for residues 57-68 of the influenza matrix protein) was obtained by
amplifying a portion of the influenza M1 gene in pApr701 by polymerase chain reaction (PCR) with
oligonucleotide primers which added a SacII site adjacent to influenza M1 codon No. 57 (Sequence ID No.
10) and a termination codon and a SacI site 3' of the M1 codon No. 68 (Sequence ID No. 11). This fragment
was cut with SacII and SacI and subcloned into BS-PE digested with SacII and SacI. The resulting plasmid
is named BS-PEMa-1 and was verified by sequencing through the junctions and the Ma sequence itself.


EXAMPLE 4 

Subcloning of PEMa from BS-PEMa-1 into pVC45DF + T

The PEMa insert (Sequence ID No. 12) was prepared by restricting BS-PEMa-1 with SacI and removing
the 3' overhang by treatment with T4 DNA polymerase, then restricting with ApaI and gel purifying.

pVC45DF + T was restricted with EcoRI and the 5' overhang filled in with Klenow enzyme treatment
(Molecular Cloning Manual, ibid.); it was subsequently restricted with ApaI and gel purified. The vector and
fragment were ligated together, and the resulting construction was named pVC-ompA-PEMa-1. The
construction was verified by sequencing across the junctions and through Ma.

The ompA leader sequence was removed from pVC-ompA-PEMa-1 by digestion with XbaI and HindIII.
An oligonucleotide fragment containing the T7 promoter, ribosome binding site, initiation sequence and a
build-back of the 5' end of the PE coding region (Sequence ID No. 13) was ligated to the vector. The
resulting construction was named pVC-PEMa-1 and encodes a T7 polymerase driven gene fusion consisting
of PE amino acids 2 to 414 joined to influenza M1 amino acids 57 to 68 (Ma) (Sequence ID No. 14 and 15).
The 5' end of pVC-PEMa-1 was verified by sequencing through the oligonucleotide fragment.

EXAMPLE 5 
Construction of pVC-PEBT

A control plasmid was constructed which encodes a T7 polymerase driven gene fusion consisting of PE
amino acids 2 to 414 followed by termination codons. pVC-PEM1-2 was digested with SacII and EcoRI to
remove the M1 sequence. The vector was gel purified and ligated to an oligonucleotide that builds back PE
codon No. 414 followed by termination signals shown in Sequence ID No. 16. The resulting construction
was named pVC-PEBT (Sequence ID No. 17 and 18) and was verified by sequencing across the junctions
and the oligonucleotide addition.

EXAMPLE 6
BSK-PEM1 

BSK-PEM1 was made from BS-PEM1 by the replacement of the 21 base pair XhoI/HindIII fragment with
a 24 base pair fragment encoding a consensus eucaryotic ribosome binding site (Sequence ID No. 19). The
purpose of the construct was to increase the yields of in vitro translated PEM1 protein. Thus, an additional
object of the invention is to increase yields of translated PEM1 protein.

EXAMPLE 7

pVCPE/2 (pVC45DF + T/2)

pVCPE/2 was made by replacing the 105 base pair PpuMI/EcoRI fragment of pVC45DF + T with a 46
base pair DNA fragment encoding an inframe duplication of PE codons 604 to 613 flanked by unique
cloning sites (Sequence ID No. 20). This construct is used for generating full-length molecules of PE with
the deletion of residue 553 resulting in an inactivated toxin domain (Sequence ID No. 21 and 22) fused to
protein segments of choice between PE codons 604 and 605. One may replace the ompA signal sequence
with the promoter/ribosome binding site as described for pVC-PEM1-2.


EXAMPLE 8 
pVCPE/2-Ma 

pVCPE/2-Ma was made by ligating into the XmaI site of pVCPE/2 a 48 base pair DNA fragment
encoding M1 amino acids 55 through 67 (Sequence ID No. 23). This construct expresses in E. coli full-length
PE with M1 amino acids 55 through 67 inserted between PE amino acids 604 and 605 (Sequence ID No. 24
and 25). One may replace the ompA signal sequence with the promoter/ribosome binding site as described
for pVC-PEM1-2.
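
The arrangement of such an insertion can be sketched with placeholder sequences. In the minimal illustration below, the host is modelled as 613 placeholder residues (the highest PE residue number referred to in these examples) and the insert as 13 placeholder residues (M1 amino acids 55 through 67). No real sequence data are used, the function name is an assumption, and the duplicated PE codons 604 to 613 and any residues contributed by the cloning sites are ignored.

```python
def insert_between(host: str, left_residue: int, insert: str) -> str:
    """Place `insert` immediately after 1-based residue `left_residue`.

    insert_between(host, 604, x) puts x between residues 604 and 605.
    """
    if not 0 <= left_residue <= len(host):
        raise ValueError("left_residue is outside the host sequence")
    return host[:left_residue] + insert + host[left_residue:]


PE_LENGTH = 613                    # highest PE residue number cited in the examples
host = "p" * PE_LENGTH             # placeholder residues, not a real sequence
m1_epitope = "m" * (67 - 55 + 1)   # 13 placeholder residues for M1 55-67

hybrid = insert_between(host, 604, m1_epitope)
print(len(hybrid))  # 613 + 13 = 626
```

In the resulting string, positions 605 through 617 (1-based) are occupied by the inserted segment, and the original PE residues 605 through 613 follow it.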


EXAMPLE 9 
pVCPE/2-M1:15-106 

35 pVCPE/2-M1:15-l06 was made by subcloning a PCR-amplified DNA fragment encoding M1 amino 

acids 15 through 106 into the Xmal site of pVCPE/2. The sequence of the oligonucleotide primers used to 
amplify the M1 segment are those shown at Sequence ID No. 26 and 27, respectively. This construct 
expresses in E. coli full length PE with M1 amino acids 15 through 106 inserted between PE amino acid 
604 and 605 (Sequence ID No. 28 and 29). One may replace the ompA signal sequence with the 

40 promoter/ribosome binding site as described for pVC-PEM1-2. 

EXAMPLE 10
pVCPEdel (403-613)

pVCPEdel (403-613) was made by restricting pVC45DF + T with SacII followed by elimination of the 3'
SacII overhang with T4 DNA polymerase and the ligation of a 3-frame termination linker whose nucleic acid
sequence is given at Sequence ID No. 30. This construct will express PE domains I, II and Ib only, fused to
the ompA leader in E. coli.

EXAMPLE 11
pVCPEdel (403-505)

pVCPEdel (403-505) was made by restricting pVC45DF + T with SacII and XhoI followed by removal of
restriction overhangs with mung bean nuclease (New England Biolabs). The vector fragment was recovered
and reclosed with DNA ligase. This construct will express in E. coli the PE protein lacking amino acids 403
through 505.







EP 0 532 090 A2 



EXAMPLE 12
pVCPEdel (494-505)

pVCPEdel (494-505) was made by restricting pVC45DF + T with BamHI and XhoI followed by the filling
in of the 5' overhangs with Klenow fragment. The vector fragment was recovered and reclosed with DNA
ligase. This construct will express in E. coli the PE protein lacking amino acids 494 through 505.



EXAMPLE 13

pVCPEdel (494-610)

pVCPEdel (494-610) was made by restricting pVC45DF + T with BamHI and PpuMI followed by the
filling in of the 5' overhangs with Klenow fragment. The vector fragment was recovered and reclosed with
DNA ligase. This construct will express in E. coli the PE protein lacking amino acids 494 through 610. All of
the pVCPEdel plasmids were useful in determining to what extent the toxin domain of PE could be
truncated without resulting in the expression of an insoluble protein in E. coli. It thus became an additional
object of the invention to provide hybrids having the minimal toxin domain of PE that would retain water
solubility.

EXAMPLE 14

Addition of Sequences Between PE and M1 in pVC-PEM1-2

Oligonucleotide linkers can be added at the SacII site between PE and M1 in pVC-PEM1-2. These linkers
can be designed to add cleavage sites and/or signal sequences which can help the M1 portion of the fusion
protein to become available for presentation within the cell. SacII digestion cleaves the gene between the
last two PE codons (for amino acids 413 and 414) and provides an appropriate site for such additions.
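As an illustration of how the restriction sites used in these constructions can be located in a sequence, the following is a minimal sketch and not part of the disclosed methods. The recognition sequences are the standard ones for the named enzymes; the helper name `find_sites` is hypothetical. The two test sequences are the last 34 bases of Sequence ID No. 1 and the whole of Sequence ID No. 6, as given in the sequence listing.

```python
# Standard recognition sequences for enzymes named in the examples.
SITES = {
    "SacII": "CCGCGG",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site present in seq."""
    seq = seq.replace(" ", "").upper()
    hits = {}
    for name, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[name] = positions
    return hits


# Last 34 bases of Sequence ID No. 1 (3' end of the PE segment).
pe_tail = "GCGACGGCGG CGACGTCAGC TTCAGCACCC GCGG"
# Sequence ID No. 6 (27 bases).
seq_id_6 = "CTCGAGAATT CATGGCCGAG GAAGCTT"

print(find_sites(pe_tail))
print(find_sites(seq_id_6))
```

Under these assumptions the SacII site is found at the very 3' end of the PE segment, which is consistent with its use as the junction for linker insertion.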

The following four constructions have been made by inserting linkers at the SacII site. The constructions
have been verified by sequencing across the SacII junctions and through the complete linker.

EXAMPLE 15
pVC-PE-RK-M1

This vector contains an Arg Lys (RK) cleavage site inserted into the SacII site, using an oligonucleotide
linker as shown in Sequence ID No. 31. The resulting amino acid sequence between amino acids 413 and
414 of PE is Gly Gly Arg Lys Ser.

EXAMPLE 16

pVC-PE-RKSig1-M1

This vector contains an Arg Lys (RK) cleavage site and the signal sequence that is shown in Sequence
ID No. 32 from the Influenza A hemagglutinin (HA) protein inserted at the SacII site, using the
oligonucleotide linker disclosed at Sequence ID No. 33. The resulting amino acid sequence between amino
acids 413 and 414 of PE is as shown in Sequence ID No. 34.

EXAMPLE 17

pVC-PE-Sig1-M1

This vector contains the signal sequence of HA without the RK cleavage site inserted into the SacII site
using the oligonucleotide linker shown at Sequence ID No. 35. The resulting amino acid sequence between
amino acids 413 and 414 of PE is as shown at Sequence ID No. 36.
EXAMPLE 18
pVC-PE-Sig2-M1

This vector contains the signal sequence shown at Sequence ID No. 37, derived from amino acids 22 to
48 of ovalbumin, inserted into the SacII site using the oligonucleotide linker of Sequence ID No. 38. The
resulting amino acid sequence between amino acids 413 and 414 of PE is as shown in Sequence
ID No. 39.

Addition of Sequences Between PE and Ma in pVC-PEMa-1

Oligonucleotide linkers can be added at the SacII site between PE and Ma in pVC-PEMa-1. These
linkers can be designed to add cleavage sites and/or signal sequences which can help the Ma peptide to
become available for presentation within the cell. SacII digestion cleaves the gene between the last two PE
codons (for amino acids 413 and 414) and thus provides an appropriate site for such additions.

The following four examples have been made by inserting linkers at the SacII site. The constructions
have been verified by sequencing across the SacII junctions and through the complete linker.

EXAMPLE 19

pVC-PE-RKSig1-Ma

This vector contains an Arg Lys (RK) cleavage site and the signal sequence from the Influenza A
hemagglutinin (HA) protein inserted into a blunted SacII site, using the oligonucleotide linker shown at
Sequence ID No. 40. The resulting amino acid sequence between amino acids 413 and 414 of PE
is as shown at Sequence ID No. 41.

EXAMPLE 20

pVC-PE-Sig1-Ma

This vector contains the signal sequence of HA without a cleavage site inserted into a blunted SacII site
using the oligonucleotide linker shown in Sequence ID No. 42. The resulting amino acid sequence between
amino acids 413 and 414 of PE is as shown in Sequence ID No. 43.

EXAMPLE 21
pVC-PE-Sig2-Ma

This vector contains a signal sequence derived from amino acids 22 through 48 of ovalbumin
inserted into a blunted SacII site, using the oligonucleotide linker shown in Sequence ID No. 44. The
resulting amino acid sequence between amino acids 413 and 414 of PE is as shown in Sequence ID
No. 45.

EXAMPLE 22

pVC-PE-Sig1Sig2-Ma

This vector contains the signal sequence derived from HA, followed by the signal sequence from
ovalbumin, inserted into the SacII site using the oligonucleotide linker shown at Sequence ID No. 46. The
resulting amino acid sequence between amino acids 413 and 414 of PE is as shown at Sequence ID
No. 47.
EXAMPLE 23
BSPEM1c5aa

The plasmid BSPEM1-2 was digested with SacI and StuI and ligated to the oligonucleotide linker shown
at Sequence ID No. 48. This linker builds back the C-terminus of the M1 protein and adds the last five amino
acids from the C-terminus of the PE protein, whose sequence is Arg Glu Asp Leu Lys, followed by a
termination codon. This also incorporates an EcoRI site. The resulting plasmid was named BSPEM1c5aa
and was sequenced across the junctions (Sequence ID No. 49 and 50) and the linker for verification of the
construction.

EXAMPLE 24
pVC-PEM1c5aa

The plasmid BSPEM1c5aa was digested with HindIII and EcoRI and the 1.8 kb PEM1c5aa fragment was gel
purified. The plasmid pVC-PEM1-2 was digested with HindIII and EcoRI, the 3.2 kb vector fragment was
ligated to the 1.8 kb PEM1c5aa fragment, and the resulting plasmid was named pVC-PEM1c5aa. The 5' and
3' ends of the PEM1c5aa insert were verified by sequencing.

EXAMPLE 25
pVC-PENPc5aa

A fragment containing the nucleoprotein (NP) of Influenza A virus was obtained from plasmid pApr501
(obtained from Peter Palese, Mt. Sinai Medical Center, New York, N.Y.; pApr501 is said nucleoprotein gene
cloned into the EcoRI site of pBR322, Fig. 4) by polymerase chain reaction with oligonucleotide primers
which added a SacII site adjacent to the ATG codon of NP to give the sequence shown at Sequence ID No.
51, and the last 5 amino acids of PE followed by a termination codon and an EcoRI site to the 3' end of NP
to give the sequence shown at Sequence ID No. 52. The polymerase chain reaction fragment was digested
with SacII and EcoRI and ligated to the plasmid pVC-PEM1-2 digested with SacII and EcoRI. The resulting
plasmid was named pVC-PENPc5aa. The 5' and 3' ends of the PENPc5aa insert (Sequence ID No. 53 and
54) were verified by sequencing. This construction fuses the binding and translocation domains of PE to the
Influenza A nucleoprotein.

EXAMPLE 26
pVC-ompA-PEGAG

The HIV GAG gene was obtained from plasmid HIVpBR322 (obtained from Ron Diehl, Merck, Sharp
and Dohme Research Laboratories, West Point, PA; Fig. 5) by polymerase chain reaction with
oligonucleotides that added a SacII site adjacent to the ATG codon of GAG to give the nucleotide sequence
shown at Sequence ID No. 55, and a SacI site immediately after the termination codon at the 3' end to give
the nucleotide sequence at Sequence ID No. 56. The polymerase chain reaction fragment was digested with
SacII and ligated to plasmid pVC45DF + T, which had been digested with EcoRI, the 5' overhang filled in by
Klenow fragment, and digested with SacII. The resulting plasmid was named pVC-ompA-PEGAG (Sequence
ID No. 57 and 58) and was verified by a partial sequence at the SacII junction.

This construction fused the binding and translocation domains of PE to the GAG gene of HIV-1 virus. The
fusion protein contains an ompA leader sequence. Alternatively, any vector containing the complete coding
region for HIV GAG can be used with these oligomers to generate the HIV GAG gene by PCR.

EXAMPLE 27

Expression of PEM1, PEMa and PEBT

Frozen competent BL21(DE3) cells (as described by Studier, et al., J. Mol. Biol., 189, 113-130, 1986) were
prepared as described (DNA Cloning, Vol. 1, p. 121, Ed. D. M. Glover, IRL Press, Wash., D.C.).
BL21(DE3) cells were transformed with pVC-PEM1-2, pVC-PEMa-1, or pVC-PEBT as described below
(this can be performed with pVC-PE fusion plasmids in general) and transformants were selected on L-Amp
plates. Fresh transformants were used to inoculate L-Amp liquid cultures at A560 = 0.1. Cultures were grown
at 37 °C with vigorous aeration and induced at A560 = 1.0 with IPTG to a final concentration of 0.4 mM.
Cultures were harvested after 3 hours of induction and the cell pellets used for protein extraction and
purification (Protein Structure: A Practical Approach, T.E. Creighton, ed., IRL Press at Oxford Univ. Press,
Ch. 9, 191 (1989)).

Transformation Procedure

A bath of dry ice/ethanol was prepared and maintained at -70 °C. Competent cells were removed from a
-70 °C freezer and thawed on ice. A sufficient number of 17 x 100 mm polypropylene tubes (Falcon 2059)
were placed on ice. 100 ul aliquots of gently mixed cells were prepared in the chilled polypropylene tubes.
DNA was added by moving a pipette through the cells while dispensing; the cells were then gently shaken
for 5 seconds after addition. The cells were incubated on ice for 30 minutes, then heat-shocked in a 42 °C
water bath for 45 seconds without shaking. The cells were again placed on ice for 2 minutes. 0.9 ml of
S.O.C. reagent (Bactotryptone 2%, Yeast Extract 0.5%, NaCl 10 mM, KCl 2.5 mM, MgCl2/MgSO4 20 mM,
Glucose 20 mM and distilled water, up to 100 ml) was added and the mixture shaken for 1 hour at 225 rpm
and 37 °C, then plated on antibiotic plates and spread gently.

EXAMPLE 28

Incubation of U-2 OS Cells With 51Cr and Protein/PEMa

U-2 OS cells (ATCC) were harvested from flasks, after a 1X wash with RCM 8, using 1 mM EDTA. The
flasks were incubated at 37 °C for 10 minutes, until cells were nonadherent. Five ml of U-2 OS medium
[McCoy's 5A (GIBCO) supplemented with 15% fetal bovine serum (HyClone) and penicillin 100 U/ml and
streptomycin 100 ug/ml (GIBCO)] was added, and the cells were centrifuged for 10 minutes at 210 x g.

Cells were resuspended in U-2 OS medium at 8.5 x 10s/ml. To each well of a 12-well plate, 0.7 ml of
cell suspension was added. Negative controls include U-2 OS medium alone and PEBT. The positive
control for sensitization of U-2 OS cells is KKAM1 (2 ug/ml), from M. Gammon and H. Zweerink (Merck,
Sharp and Dohme Research Laboratories, Rahway, NJ). PEMa was added at 0.2 uM or greater well
concentration. Simultaneously, 137.5 uCi of 51Cr (Amersham) was added to each well. Medium was added
to all wells to bring the total volume to 1 ml. This was placed at 37 °C, 5.5% CO2 for 14 hours.

EXAMPLE 29

Assay Protocol for CTL Activity Against Sensitized U-2 OS Targets

After the 14 hour incubation, U-2 OS cells were removed, after a 1X RCM 8 wash, using 1 mM EDTA. Plates
were incubated at 37 °C for 10 minutes until cells were nonadherent. K medium [RPMI 1640 (GIBCO)
supplemented with 10% fetal bovine serum (HyClone), 10 mM HEPES (GIBCO), 2 mM L-glutamine
(GIBCO), penicillin 100 U/ml and streptomycin 100 ug/ml (GIBCO), and 50 uM 2-mercaptoethanol
(Bio-Rad)] was added to give a total volume of 10 ml; cells were centrifuged for 10 minutes at 210 x g. The cells
were incubated at room temperature for 10 minutes in 10 ml of K medium before entering the second
centrifugation. The cells were then resuspended in 1 ml of K medium, counted, and resuspended to
1 x 10^5/ml in K medium.

Human cytotoxic T lymphocytes, generated from one donor, were harvested, centrifuged for 10 minutes
at 92 x g, and resuspended in K medium at 2.5 x 10^6/ml.

100 ul of human CTLs were added to each well of a 96-well U-bottom microtiter plate (CoStar). 100 ul
of the U-2 OS 51Cr-labeled targets were also added to these wells for a final effector/target ratio of 25:1.
Spontaneous 51Cr release was determined by incubating U-2 OS cells with 100 ul of K medium alone. The
maximal release was determined by adding 100 ul of 6 M HCl to 100 ul of targets. The plates were quickly
centrifuged to bring down the cells, and incubated for 2 hours at 37 °C.
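The stated effector/target ratio follows directly from the cell densities and the volumes plated per well. The short sketch below reproduces that arithmetic; the function names are hypothetical and serve only as a check on the numbers given in this example.

```python
def cells_in_volume(density_per_ml, volume_ul):
    """Number of cells contained in volume_ul of a suspension at density_per_ml."""
    return density_per_ml * volume_ul / 1000.0


def effector_target_ratio(effector_density, effector_ul, target_density, target_ul):
    """Effector:target ratio for one well, given densities (cells/ml) and volumes (ul)."""
    effectors = cells_in_volume(effector_density, effector_ul)
    targets = cells_in_volume(target_density, target_ul)
    return effectors / targets


# Values from this example: CTLs at 2.5 x 10^6/ml, targets at 1 x 10^5/ml, 100 ul of each.
ratio = effector_target_ratio(2.5e6, 100, 1e5, 100)
print(f"effector/target ratio = {ratio:g}:1")
```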

After this 2 hour incubation, the plates were centrifuged for 5 minutes, 330 x g, 5 °C; 30 ul of
supernatant was harvested from each well onto a plastic-backed filtermat (Pharmacia/LKB). The mat was
dried in the microwave for 3 minutes, on medium-high power. The mat was placed into a sample bag with
10 ml of BetaPlate Scint, heat sealed and placed into the BetaPlate 1205 counter (Pharmacia/LKB). Results
were expressed as % specific lysis, defined as:

$$\%\ \text{specific lysis} = \frac{\text{Experimental} - \text{Spontaneous}}{\text{Maximal} - \text{Spontaneous}} \times 100$$

where

Experimental = counts per minute from the 30 ul of supernatant harvested from the wells containing
targets plus human cytotoxic T lymphocytes, as determined by the BetaPlate 1205 counter;

Spontaneous = counts per minute from the 30 ul of supernatant harvested from the wells containing targets
plus medium alone, as determined by the BetaPlate 1205 counter; and

Maximal = counts per minute from the 30 ul of supernatant harvested from the wells containing targets plus
6 M HCl (Fisher Scientific), as determined by the BetaPlate 1205 counter.

Results are presented graphically in Fig. 5, with U-2 OS medium alone and PEBT as negative controls,
and KKAM1 as a positive control. Greater than 10% specific lysis is considered a positive response
(Cerottini, et al., J. Exp. Med. 140:703, 1974).
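The % specific lysis calculation and the 10% positivity threshold can be sketched as follows. This is an illustrative helper only: the function names are hypothetical and the counts in the usage line are made-up numbers, not data from the assay.

```python
def percent_specific_lysis(experimental, spontaneous, maximal):
    """Percent specific lysis from counts per minute (cpm).

    experimental: cpm from wells with targets plus CTLs
    spontaneous:  cpm from wells with targets plus medium alone
    maximal:      cpm from wells with targets plus 6 M HCl
    """
    if maximal <= spontaneous:
        raise ValueError("maximal release must exceed spontaneous release")
    return (experimental - spontaneous) / (maximal - spontaneous) * 100.0


def is_positive(lysis_percent, threshold=10.0):
    """A response is scored positive when specific lysis is greater than 10%."""
    return lysis_percent > threshold


# Hypothetical counts, for illustration only.
lysis = percent_specific_lysis(experimental=1500, spontaneous=500, maximal=2500)
print(lysis, is_positive(lysis))
```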



EXAMPLE 30

Generation of M1-specific Human Cytotoxic T Lymphocytes

Original stock of human cytotoxic T lymphocytes was derived by harvesting blood from one donor into
a syringe (Becton Dickinson) containing 25 U of heparin for each ml of whole blood (Elkins-Sinn, Inc.). The
heparinized blood was pipetted directly into a Leucoprep tube (Becton Dickinson) and centrifuged for 20
minutes at 1700 x g. The buffy coat which was seen just above the interface was removed, centrifuged for
10 minutes at 92 x g, and washed twice in RPMI 1640 (GIBCO). The peripheral blood mononuclear cells
(PBLs) recovered from the Leucoprep procedure were resuspended in 10 ml of CTL medium [RPMI 1640
(GIBCO) supplemented with 10% donor or pooled human plasma, 4 mM L-glutamine, 10 mM HEPES,
penicillin 100 U/ml and streptomycin 100 ug/ml (GIBCO)] at 1 x 10^6/ml.

M1 peptide (received from M. Gammon and H. Zweerink, MSDRL, Rahway; 2 mg/ml stock) in DMSO
was diluted 1:10 in RPMI 1640 (GIBCO). M1 peptide was added to the 10 ml of lymphocytes at a final
concentration of 5 ug/ml. The cells were then plated at 1.5 x 10^6/well in 24-well plates (Nunc).

Two U/ml of Interleukin-2 ala-125 (Amgen) was added on Day 3. The cell density was adjusted to
1 x 10^6/ml as needed, and the medium was supplemented with 2 U/ml additional Interleukin-2 to compensate
for the increase in volume. Cells were restimulated with peptide-pulsed peripheral blood lymphocytes every
7 days as described below. Interleukin-2 ala-125 (Amgen) was replenished every 3 days.

Cytotoxic T lymphocytes and unstimulated PBLs were frozen (CryoMed) in a mixture of 70% RPMI
1640 (GIBCO), 20% fetal bovine serum (HyClone), and 10% dimethyl sulfoxide (Sigma) and thawed as
needed.

EXAMPLE 31

Recovery and Restimulation of Frozen CTLs

Cytotoxic T lymphocytes (CTLs) were thawed in a 37 °C water bath and then resuspended in 35 ml of
CTL medium [RPMI 1640 (GIBCO) supplemented with 10% donor or pooled human plasma, 4 mM
L-glutamine, 10 mM HEPES, penicillin 100 U/ml and streptomycin 100 ug/ml (GIBCO)]. The cytotoxic T
lymphocytes were then placed at 37 °C, 5% CO2 for 1 hour. The cell suspension was centrifuged for 10
minutes at 92 x g. The cells were resuspended at 5 x 10^5/ml in CTL medium.

The source of stimulator cells for the freshly thawed cytotoxic T lymphocytes was freshly harvested
PBL, which had been collected using the Leucoprep method described above. For peptide pulsing, an
appropriate number (2 x 10^6 to 10^7) of PBL were centrifuged, the supernatant was aspirated, and KKAM1 at
200 ug/ml in RPMI 1640 (GIBCO) plus 10% DMSO (Sigma) was added at the rate of 100 ul of KKAM1 for
every 10^7 cells. The cells were incubated for 1 hour at 37 °C, 5% CO2. The peptide-pulsed peripheral blood
lymphocytes were irradiated with 2,000 Rads using a 60Co source. The cells were washed once in RPMI
1640, centrifuged for 10 minutes at 92 x g, and resuspended in CTL medium at 1 x 10^6/ml.

Equal volumes of cytotoxic T lymphocytes and irradiated, peptide-pulsed peripheral blood lymphocytes
were mixed together for a final ratio of 1 CTL:2 peptide-pulsed PBL. Interleukin-2 ala-125 (Amgen) was
added at a final concentration of 2 U/ml. The cells were thoroughly mixed together with the Interleukin-2
ala-125 and 1.2 ml was plated into each well of a 48-well plate (CoStar).

The cells were counted and Interleukin-2 ala-125 was replenished every 3 days. This was achieved by
pooling all the wells into a centrifuge tube, counting the cells in a hemocytometer counting chamber,
adjusting the cells to 1 x 10^6/ml with CTL medium, and adding 2 U/ml of Interleukin-2 ala-125. Then 1.5 x
10^6 cytotoxic T lymphocytes in 1.5 ml of CTL medium with Interleukin-2 ala-125 were plated into each well
of a 24-well plate (CoStar). The restimulation process was repeated every seven days, at which time frozen
PBLs were then used as the source of stimulators.
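The density adjustment performed at each feeding is a simple dilution calculation. The sketch below assumes the target density of 1 x 10^6 cells/ml, 2 U/ml of Interleukin-2 and 1.5 ml per well stated in this example; the function names and the cell count in the usage lines are hypothetical.

```python
def medium_to_add_ml(total_cells, current_volume_ml, target_density_per_ml=1e6):
    """Volume of CTL medium (ml) needed to bring the pooled cells to the target density."""
    final_volume_ml = total_cells / target_density_per_ml
    if final_volume_ml < current_volume_ml:
        raise ValueError("cells are already below the target density; concentrate instead")
    return final_volume_ml - current_volume_ml


def il2_units(final_volume_ml, units_per_ml=2.0):
    """Units of Interleukin-2 for a final concentration of units_per_ml."""
    return final_volume_ml * units_per_ml


# Hypothetical pool: 9 x 10^6 cells counted in 6 ml.
added = medium_to_add_ml(9e6, 6.0)   # ml of medium to add
final_volume = 6.0 + added           # ml after adjustment
wells = int(final_volume // 1.5)     # 1.5 ml (1.5 x 10^6 cells) per well
print(added, il2_units(final_volume), wells)
```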

EXAMPLE 32

Binding of PEMa to the PE receptor

PEMa was used in a binding/competition assay to compete with PE for the PE receptor on U-2 OS
cells. As shown in Figure 6, PEMa thereby protected the cells from the toxic effects of PE. Therefore,
replacement of the toxin domain of PE with the Influenza matrix peptide (amino acids 57-68) did not prevent
the binding of this chimeric protein to the PE receptor. This suggests that the ability of PEMa to sensitize
target cells for lysis by CTLs specific for the matrix peptide is mediated through PE receptor-mediated
uptake and processing.

U-2 OS cells were grown to a density of 20,000 cells/100 ul in 96-well plates. Cells were preincubated with
PEMa (0, 0.1, 1, 10 and 50 ug in 100 ul of complete McCoy's 5A medium) for 30 minutes at 37 °C, followed
by incubation with or without PE (10 ng) for 2 minutes. This represents a 0-, 10-, 100-, 1000-, and 5000-fold
excess of PEMa over PE, respectively. Cells were washed with McCoy's medium (3 x 200 ul), then
incubated with [35S]methionine (2 uCi/100 ul) for an additional 5 hours at 37 °C and washed (3 x 200 ul).
Cells were lysed in 10 mM EDTA (100 ul) and aliquots (5 ul) were spotted onto Whatman 3MM filters.
Incorporation of radioactivity was assayed by TCA precipitation of the cellular proteins onto the filter papers
by immersion into ice-cold TCA (10% w/v) for at least 1 hour. Filters were washed once with 5% TCA and 3
times with ethanol and dried. Radioactivity was determined by liquid scintillation counting. Incorporation of
[35S]methionine into the TCA-precipitable pool of cellular proteins in the absence (open circles) or presence
(closed circles) of PE is shown as a function of log excess PEMa. Error bars represent +/- SEM for n = 9.
Using a one-tailed t-test, incorporation of [35S]methionine was determined to be significantly lower in the
presence of PE than in the absence of PE at 0-, 10-, and 100-fold excesses of PEMa (99.5%, 99.5% and
95% confidence limits, respectively). However, at 1000- and 5000-fold excesses of PEMa, incorporation was
not significantly different in the presence or absence of PE.
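The fold-excess values quoted for the competition assay can be checked with a few lines of arithmetic. This sketch assumes the excess is expressed on a mass basis, since both amounts are given as masses (ug of PEMa, ng of PE); the variable and function names are hypothetical.

```python
PE_NG = 10                      # ng of PE per well
PEMA_UG = [0, 0.1, 1, 10, 50]   # ug of PEMa per well


def fold_excess(competitor_ug, toxin_ng):
    """Mass-based fold excess of competitor over toxin."""
    competitor_ng = round(competitor_ug * 1000)  # convert ug to ng
    return competitor_ng / toxin_ng


excesses = [fold_excess(ug, PE_NG) for ug in PEMA_UG]
for ug, fold in zip(PEMA_UG, excesses):
    print(f"{ug:>5} ug PEMa -> {fold:g}-fold excess over {PE_NG} ng PE")
```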

Following preparation of the protein hybrids of the present invention, a suspension of the protein
hybrids suitable for injection into the host animal must be prepared. Typical suspension vehicles include
sterile saline and sterile water for injection. Various agents may be added as preservatives, including
benzethonium chloride (0.0025%), phenol (0.5%) and thiomersal (1:10,000). Strength of the vaccine will be
measured as the mass of fusion protein which generates a protective response, defined by in vitro/in vivo
results, per given host species, a method known to those of ordinary skill in the art.

The suspensions for injection must, of course, be prepared under sterile conditions, in which there is a
total absence of living organisms and absolute freedom from biological contamination in the
suspension for injection.

Although water is always the solvent of choice for an injectable preparation, co-solvents that may be
additionally present include ethyl alcohol, glycerin, propylene glycol, polyethylene glycol and
dimethylacetamide. Buffers may be added, including acetic acid, citric acid or phosphoric acid systems.
Antioxidants can include ascorbic acid, BHA, BHT, sodium bisulfite, and sodium metabisulfite. Tonicity can
be adjusted with agents such as dextrose, sodium chloride and sodium sulfate.

Aseptic manufacture of vaccines, including their packaging, is conducted according to methods well
known to those of ordinary skill in the art, and as described in standard texts on the subject, including
Lachman, L., et al., The Theory and Practice of Industrial Pharmacy; Dittert, L., ed., Sprowls' American
Pharmacy; and Remington's Pharmaceutical Sciences.

While the invention has been described and illustrated in reference to certain preferred embodiments
thereof, those skilled in the art will appreciate that various changes, modifications and substitutions can be
made therein without departing from the spirit and scope of the invention. It is intended, therefore, that the
invention be limited only by the scope of the claims which follow, and that such claims be interpreted as
broadly as is reasonable.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Liu, Margaret
Oliff, Allen
Donnelly, John
Hawe, Linda
Ulmer, Jeffrey
Shi, Xiao-Ping
Friedman, Arthur
Montgomery, Donna

(ii) TITLE OF INVENTION: Cellular Immunity Vaccines From Bacterial Toxin-Antigen Conjugates

(iii) NUMBER OF SEQUENCES: 58

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Merck & Co., Inc.
(B) STREET: 126 Lincoln Avenue
(C) CITY: Rahway
(D) STATE: New Jersey
(E) COUNTRY: U.S.
(F) ZIP: 07065



(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Grassler, Frank P.
(B) REGISTRATION NUMBER: 31,164
(C) REFERENCE/DOCKET NUMBER: 18475

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (908)594-3462
(B) TELEFAX: (908)594-4720
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1294 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
TCGCGATTGC AGTGGCACTG GCTGGTTTCG CTACCGTAGC GCAGGCCGCG AATTTGGCCG 
AAGAAGCTTT CGACCTCTGG AACGAATGCG CCAAAGCCTG CGTGCTCGAC CTCAAGGACG 
GCGTGCGTTC CAGCCGCATG AGCGTCGACC CGGCCATCGC CGACACCAAC GGCCAGGGCG 
TGCTGCACTA CTCCATGGTC CTGGAGGGCG GCAACGACGC GCTCAAGCTG GCCATCGACA 
ACGCCCTCAG CATCACCAGC GACGGCCTGA CCATCCGCCT CGAAGGCGGC GTCGAGCCGA 
ACAAGCCGGT GCGCTACAGC TACACGCGCC AGGCGCGCGG CAGTTGGTCG CTGAACTGGC 
TGGTACCGAT CGGCCACGAG AAGCCCTCGA ACATCAAGGT GTTCATCCAC GAACTGAACG 
CCGGCAACCA GCTCAGCCAC ATGTCGCCGA TCTACACCAT CGAGATGGGC GACGAGTTGC 
TGGCGAAGCT GGCGCGCGAT GCCACCTTCT TCGTCAGGGC GCACGAGAGC AACGAGATGC 
AGCCGACGCT CGCCATCAGC CATGCCGGGG TCAGCGTGGT CATGGCCCAG ACCCAGCCGC 
GCCGGGAAAA GCGCTGGAGC GAATGGGCCA GCGGCAAGGT GTTGTGCCTG CTCGACCCGC 
TGGACGGGGT CTACAACTAC CTCGCCCAGC AACGCTGCAA CCTCGACGAT ACCTGGGAAG 
GCAAGATCTA CCGGGTGCTC GCCGGCAACC CGGCGAAGCA TGACCTGGAC ATCAAACCCA
CGGTCATCAG TCATCGCCTG CACTTTCCCG AGGGCGGCAG CCTGGCCGCG CTGACCGCGC 
ACCAGGCTTG CCACCTGCCG CTGGAGACTT TCACCCGTCA TCGCCAGCCG CGCGGCTGGG 
AACAACTGGA GCAGTGCGGC TATCCGGTGC AGCGGCTGGT CGCCCTCTAC CTGGCGGCGC 
GGCTGTCGTG GAACCAGGTC GACCAGGTGA TCCGCAACGC CCTGGCCAGC CCCGGCAGCG 
GCGGCGACCT GGGCGAAGCG ATCCGCGAGC AGCCGGAGCA GGCCCGTCTG GCCCTGACCC 1080

TGGCCGCCGC CGAGAGCGAG CGCTTCGTCC GGCAGGGCAC CGGCAACGAC GAGGCCGGCG 1140 

CGGCCAACGC CGACGTGGTG AGCCTGACCT GCCCGGTCGC CGCCGGTGAA TGCGCGGGCC 1200 

CGGCGGACAG CGGCGACGCC CTGCTGGAGC GCAACTATCC CACTGGCGCG GAGTTCCTCG 1260 

GCGACGGCGG CGACGTCAGC TTCAGCACCC GCGG 1294 
(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 759 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

ATGAGTCTTC TAACCGAGGT CGAAACGTAC GTTCTCTCTA TCATCCCGTC AGGCCCCCTC 60

AAAGCCGAGA TCGCACAGAG ACTTGAAGAT GTCTTTGCAG GGAAGAACAC CGATCTTGAG 120

GTTCTCATGG AATGGCTAAA GACAAGACCA ATCCTGTCAC CTCTGACTAA GGGGATTTTA 180

GGATTTGTGT TCACGCTCAC CGTGCCCAGT GAGCGAGGAC TGCAGCGTAG ACGCTTTGTC 240

CAAAATGCCC TTAATGGGAA CGGGGATCCA AATAACATGG ACAAAGCAGT TAAACTGTAT 300

AGGAAGCTCA AGAGGGAGAT AACATTCCAT GGGGCCAAAG AAATCTCACT CAGTTATTCT 360

GCTGGTGCAC TTGCCAGTTG TATGGGCCTC ATATACAACA GGATGGGGGC TGTGACCACT 420

GAAGTGGCAT TTGGCCTGGT ATGTGCAACC TGTGAACAGA TTGCTGACTC CCAGCATCGG 480

TCTCATAGGC AAATGGTGAC AACAACCAAC CCACTAATCA GACATGAGAA CAGAATGGTT 540

TTAGCCAGCA CTACAGCTAA GGCTATGGAG CAAATGGCTG GATCGAGTGA GCAAGCAGCA 600

GAGGCCATGG AGGTTGCTAG TCAGGCTAGG CAAATGGTGC AAGCGATGAG AACCATTGGG 660

ACTCATCCTA GCTCCAGTGC TGGTCTGAAA AATGATCTTC TTGAAAATTT GCAGGCCTAT 720

CAGAAACGAA TGGGGGTGCA GATGCAACGG TTCAAGTGA 759
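The correspondence between a nucleic acid entry and its protein translation can be checked by translating the DNA with the standard genetic code. The following is a minimal sketch, not part of the listing itself: the `translate` helper is hypothetical, and the test input is the first 60 bases of SEQ ID NO:2 as printed above.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering of codons.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) in frame 1 into one-letter amino acids.

    Stop codons are rendered as '*'; any trailing partial codon is ignored.
    """
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 60 bases of SEQ ID NO:2.
m1_start = "ATGAGTCTTC TAACCGAGGT CGAAACGTAC GTTCTCTCTA TCATCCCGTC AGGCCCCCTC"
print(translate(m1_start))
```

The one-letter output corresponds to the first twenty residues of the protein entry that follows (Met Ser Leu Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Ile Pro Ser Gly Pro Leu).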
(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 253 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Ser Leu Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Ile Pro
1 5 10 15

Ser Gly Pro Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu Asp Val Phe
20 25 30

Ala Gly Lys Asn Thr Asp Leu Glu Val Leu Met Glu Trp Leu Lys Thr
35 40 45

Arg Pro Ile Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe
50 55 60

Thr Leu Thr Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg Phe Val
65 70 75 80

Gln Asn Ala Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Lys Ala
85 90 95

Val Lys Leu Tyr Arg Lys Leu Lys Arg Glu Ile Thr Phe His Gly Ala
100 105 110

Lys Glu Ile Ser Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser Cys Met
115 120 125

Gly Leu Ile Tyr Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe
130 135 140

Gly Leu Val Cys Ala Thr Cys Glu Gln Ile Ala Asp Ser Gln His Arg
145 150 155 160

Ser His Arg Gln Met Val Thr Thr Thr Asn Pro Leu Ile Arg His Glu
165 170 175

Asn Arg Met Val Leu Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met
180 185 190

Ala Gly Ser Ser Glu Gln Ala Ala Glu Ala Met Glu Val Ala Ser Gln
195 200 205

Ala Arg Gln Met Val Gln Ala Met Arg Thr Ile Gly Thr His Pro Ser
210 215 220

Ser Ser Ala Gly Leu Lys Asn Asp Leu Leu Glu Asn Leu Gln Ala Tyr
225 230 235 240

Gln Lys Arg Met Gly Val Gln Met Gln Arg Phe Lys Xaa
245 250

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
ATACCCGCGG CAGTCTTCTA ACCGAGGTCG

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
CCCCACGTCT ACGTTGCCAA GTTCACTCTC GAGATA






(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

CTCGAGAATT CATGGCCGAG GAAGCTT 27

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1998 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

ATGGCCGAAG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 60

AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 120

CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 180

ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 240

GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 300

AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 360

CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 420

GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 480

GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 540






CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 600

GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 660

TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 720

AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 780

ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 840

GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 900

GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 960

GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 1020

CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 1080

GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 1140

GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 1200

TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCAGTCTTCT AACCGAGGTC 1260

GAAACGTACG TTCTCTCTAT CATCCCGTCA GGCCCCCTCA AAGCCGAGAT CGCACAGAGA 1320

CTTGAAGATG TCTTTGCAGG GAAGAACACC GATCTTGAGG TTCTCATGGA ATGGCTAAAG 1380

ACAAGACCAA TCCTGTCACC TCTGACTAAG GGGATTTTAG GATTTGTGTT CACGCTCACC 1440

GTGCCCAGTG AGCGAGGACT GCAGCGTAGA CGCTTTGTCC AAAATGCCCT TAATGGGAAC 1500

GGGGATCCAA ATAACATGGA CAAAGCAGTT AAACTGTATA GGAAGCTCAA GAGGGAGATA 1560

ACATTCCATG GGGCCAAAGA AATCTCACTC AGTTATTCTG CTGGTGCACT TGCCAGTTGT 1620

ATGGGCCTCA TATACAACAG GATGGGGGCT GTGACCACTG AAGTGGCATT TGGCCTGGTA 1680

TGTGCAACCT GTGAACAGAT TGCTGACTCC CAGCATCGGT CTCATAGGCA AATGGTGACA 1740

ACAACCAACC CACTAATCAG ACATGAGAAC AGAATGGTTT TAGCCAGCAC TACAGCTAAG 1800

GCTATGGAGC AAATGGCTGG ATCGAGTGAG CAAGCAGCAG AGGCCATGGA GGTTGCTAGT 1860

CAGGCTAGGC AAATGGTGCA AGCGATGAGA ACCATTGGGA CTCATCCTAG CTCCAGTGCT 1920

GGTCTGAAAA ATGATCTTCT TGAAAATTTG CAGGCCTATC AGAAACGAAT GGGGGTGCAG 1980

ATGCAACGGT TCAAGTGA 1998
(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 666 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys
1 5 10 15

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp
20 25 30

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met
35 40 45

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala
50 55 60

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val
65 70 75 80

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly
85 90 95

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser
100 105 110

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser
115 120 125

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala
130 135 140

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn
145 150 155 160

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val
165 170 175

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala
180 185 190

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn
195 200 205

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys
210 215 220

Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile
225 230 235 240

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser
245 250 255

Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr
260 265 270

Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys
275 280 285

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu
290 295 300

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro
305 310 315 320

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln
325 330 335

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val
340 345 350

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val
355 360 365

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala
370 375 380

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu
385 390 395 400

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Ser Leu
405 410 415

Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Ile Pro Ser Gly Pro
420 425 430

Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu Asp Val Phe Ala Gly Lys
435 440 445

Asn Thr Asp Leu Glu Val Leu Met Glu Trp Leu Lys Thr Arg Pro Ile
450 455 460

Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe Thr Leu Thr
465 470 475 480

Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg Phe Val Gln Asn Ala
485 490 495

Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Lys Ala Val Lys Leu
500 505 510

Tyr Arg Lys Leu Lys Arg Glu Ile Thr Phe His Gly Ala Lys Glu Ile
515 520 525

Ser Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser Cys Met Gly Leu Ile
530 535 540

Tyr Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe Gly Leu Val
545 550 555 560

Cys Ala Thr Cys Glu Gln Ile Ala Asp Ser Gln His Arg Ser His Arg
565 570 575

Gln Met Val Thr Thr Thr Asn Pro Leu Ile Arg His Glu Asn Arg Met
580 585 590

Val Leu Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met Ala Gly Ser
595 600 605

Ser Glu Gln Ala Ala Glu Ala Met Glu Val Ala Ser Gln Ala Arg Gln
610 615 620

Met Val Gln Ala Met Arg Thr Ile Gly Thr His Pro Ser Ser Ser Ala
625 630 635 640

Gly Leu Lys Asn Asp Leu Leu Glu Asn Leu Gln Ala Tyr Gln Lys Arg
645 650 655

Met Gly Val Gln Met Gln Arg Phe Lys Xaa
660 665

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
CTAGAAATAA TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGCCGAA GA

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
ATACCCGCGG CAAGGGGATT TTAGGATTTG TG

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
ATAGAGCTCT CACACGGTGA GCGTGAACAC AAATCC

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CCGCGGCAAG GGGATTTTAG GATTTGTGTT CACGCTCACC GTGTGAGAGC TC 52
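
The 52-base SEQ ID NO: 12 can be compared against the two preceding oligonucleotides. The following is a minimal Python sketch, assuming (the listing itself does not say so) that SEQ ID NO: 10 and the reverse complement of SEQ ID NO: 11 anneal through a short 3' overlap, and that SEQ ID NO: 12 lies inside the joined strand. The helper names `reverse_complement` and `merge_by_overlap` are hypothetical and used only for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def merge_by_overlap(left: str, right: str, min_overlap: int = 6):
    """Join two strands on their longest suffix/prefix overlap, or return None."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None


# Sequences copied from the listing, with spaces removed.
SEQ_10 = "ATACCCGCGGCAAGGGGATTTTAGGATTTGTG"
SEQ_11 = "ATAGAGCTCTCACACGGTGAGCGTGAACACAAATCC"
SEQ_12 = "CCGCGGCAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGTGAGAGCTC"

# Assumption: SEQ ID NO: 11 is the antisense oligonucleotide.
merged = merge_by_overlap(SEQ_10, reverse_complement(SEQ_11))

print("stated length matches:", len(SEQ_12) == 52)
print("SEQ ID NO: 12 found in joined strand:", merged is not None and SEQ_12 in merged)
```

Under these assumptions the joined strand contains SEQ ID NO: 12 flanked by a few extra terminal bases, which is what one would expect if the ends are trimmed by restriction digestion; that interpretation is an inference from the sequences, not a statement from the listing.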
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CTAGAAATAA TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGCCGAA GA 52
(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1281 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

ATGGCCGAGG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 60

AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 120 

CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 180 

ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 240 

GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 300 
AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 360

CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 420 

GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 480

GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 540 

CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 600 

GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 660

TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 720 

AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 780

ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 840 

GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 900 

GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 960 

GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 1020 

CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 1080 

GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 1140 

GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 1200 

TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCAAGGGGAT TTTAGGATTT 1260 

GTGTTCACGC TCACCGTGTG A 1281 

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 427 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys
1 5 10 15

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp
20 25 30

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met
35 40 45

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala
50 55 60

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val
65 70 75 80

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly
85 90 95

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser
100 105 110

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser
115 120 125

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala
130 135 140

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn
145 150 155 160

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val
165 170 175

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala
180 185 190

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn
195 200 205

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys
210 215 220

Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile
225 230 235 240

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser
245 250 255

Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr
260 265 270

Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys
275 280 285

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu
290 295 300

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro
305 310 315 320

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln
325 330 335

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val
340 345 350

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val
355 360 365

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala
370 375 380

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu
385 390 395 400

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Lys Gly
405 410 415

Ile Leu Gly Phe Val Phe Thr Leu Thr Val Xaa
420 425



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GGCTGATAAT AGAGCTCG

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1245 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

ATGGCCGAGG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 60 

AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 120 

CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 180 

ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 240 

GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 300 

AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 360 

CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 420 

GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 480 

GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 540 

CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 600 

GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 660 

TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 720 

AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 780 

ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 840 

GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 900 

GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 960 

GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 1020 
CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 1080

GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 1140

GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 1200 

TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCTGA 1245 
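
The stated lengths of SEQ ID NO: 17 (1245 base pairs) and SEQ ID NO: 18 (415 amino acids) are consistent with 415 codons when the terminal stop codon is counted. The sketch below translates the final 45 bases of SEQ ID NO: 17 using the standard genetic code. It assumes the reading frame starts at base 1 of the listed sequence; the `translate` helper and the table constants are illustrative names, not anything defined in the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA (spaces allowed) to one-letter amino acids; '*' marks a stop."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Bases 1201-1245 of SEQ ID NO: 17, copied from the last listed line.
TAIL_SEQ_17 = "TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCTGA"

print(translate(TAIL_SEQ_17))
```

Because base 1200 closes a codon, the tail can be translated on its own. The same helper can be applied to any other nucleotide entry in the listing to cross-check the corresponding protein entry.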
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 415 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys
1 5 10 15

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp
20 25 30

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met
35 40 45

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala
50 55 60

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val
65 70 75 80

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly
85 90 95

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser
100 105 110

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser
115 120 125

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala
130 135 140

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn
145 150 155 160

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val
165 170 175

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala
180 185 190

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn
195 200 205

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys
210 215 220

Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile
225 230 235 240

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser
245 250 255

Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr
260 265 270

Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys
275 280 285

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu
290 295 300

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro
305 310 315 320

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln
325 330 335

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val
340 345 350

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val
355 360 365

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala
370 375 380

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu
385 390 395 400

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr
405 410

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TCGAGCCGCC ACCATGGCCG AGGAA

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
GACCCGCTAG CACCCGGGAA ACCGCCGCGC GAGGACCTGA AGTAAG

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1956 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

ATGCACCTGA TACCCCATTG GATCCCCCTG GTCGCCAGCC TCGGCCTGCT CGCCGGCGGC 60 

TCGTCCGCGT CCGCCGCCGA GGAAGCTTTC GACCTCTGGA ACGAATGCGC CAAAGCCTGC 120 

GTGCTCGACC TCAAGGACGG CGTGCGTTCC AGCCGCATGA GCGTCGACCC GGCCATCGCC 180 

GACACCAACG GCCAGGGCGT GCTGCACTAC TCCATGGTCC TGGAGGGCGG CAACGACGCG 240 

CTCAAGCTGG CCATCGACAA CGCCCTCAGC ATCACCAGCG ACGGCCTGAC CATCCGCCTC 300 

GAAGGCGGCG TCGAGCCGAA CAAGCCGGTG CGCTACAGCT ACACGCGCCA GGCGCGCGGC 360 

AGTTGGTCGC TGAACTGGCT GGTACCGATC GGCCACGAGA AGCCCTCGAA CATCAAGGTG 420

TTCATCCACG AACTGAACGC CGGCAACCAG CTCAGCCACA TGTCGCCGAT CTACACCATC 480 

GAGATGGGCG ACGAGTTGCT GGCGAAGCTG GCGCGCGATG CCACCTTCTT CGTCAGGGCG 540 

CACGAGAGCA ACGAGATGCA GCCGACGCTC GCCATCAGCC ATGCCGGGGT CAGCGTGGTC 600 

ATGGCCCAGA CCCAGCCGCG CCGGGAAAAG CGCTGGAGCG AATGGGCCAG CGGCAAGGTG 660 

TTGTGCCTGC TCGACCCGCT GGACGGGGTC TACAACTACC TCGCCCAGCA ACGCTGCAAC 720 

CTCGACGATA CCTGGGAAGG CAAGATCTAC CGGGTGCTCG CCGGCAACCC GGCGAAGCAT 780 

GACCTGGACA TCAAACCCAC GGTCATCAGT CATCGCCTGC ACTTTCCCGA GGGCGGCAGC 840 

CTGGCCGCGC TGACCGCGCA CCAGGCTTGC CACCTGCCGC TGGAGACTTT CACCCGTCAT 900 

CGCCAGCCGC GCGGCTGGGA ACAACTGGAG CAGTGCGGCT ATCCGGTGCA GCGGCTGGTC 960 

GCCCTCTACC TGGCGGCGCG GCTGTCGTGG AACCAGGTCG ACCAGGTGAT CCGCAACGCC 1020 

CTGGCCAGCC CCGGCAGCGG CGGCGACCTG GGCGAAGCGA TCCGCGAGCA GCCGGAGCAG 1080 

GCCCGTCTGG CCCTGACCCT GGCCGCCGCC GAGAGCGAGC GCTTCGTCCG GCAGGGCACC 1140 

GGCAACGACG AGGCCGGCGC GGCCAACGCC GACGTGGTGA GCCTGACCTG CCCGGTCGCC 1200 

GCCGGTGAAT GCGCGGGCCC GGCGGACAGC GGCGACGCCC TGCTGGAGCG CAACTATCCC 1260 

ACTGGCGCGG AGTTCCTCGG CGACGGCGGC GACGTCAGCT TCAGCACCCG CGGCACGCAG 1320 

AACTGGACGG TGGAGCGGCT GCTCCAGGCG CACCGCCAAC TGGAGGAGCG CGGCTATGTG 1380 
TTCGTCGGCT ACCACGGCAC CTTCCTCGAA GCGGCGCAAA GCATCGTCTT CGGCGGGGTG 1440

CGCGCGCGCA GCCAGGACCT CGACGCGATC TGGCGCGGTT TCTATATCGC CGGCGATCCG 1500

GCGCTGGCCT ACGGCTACGC CCAGGACCAG GAACCCGACG CACGCGGCCG GATCCGCAAC 1560 

GGTGCCCTGC TGCGGGTCTA TGTGCCGCGC TCGAGCCTGC CGGGCTTCTA CCGCACCAGC 1620 

CTGACCCTGG CCGCGCCGGA GGCGGCGGGC GAGGTCGAAC GGCTGATCGG CCATCCGCTG 1680

CCGCTGCGCC TGGACGCCAT CACCGGCCCC GAGGAGGAAG GCGGGCGCCT GGAGACCATT 1740 

CTCGGCTGGC CGCTGGCCGA GCGCACCGTG GTGATTCCCT CGGCGATCCC CACCGACCCG 1800 

CGCAACGTCG GCGGCGACCT CGACCCGTCC AGCATCCCCG ACAAGGAACA GGCGATCAGC 1860 

GCCCTGCCGG ACTACGCCAG CCAGCCCGGC AAACCGCCGC GCGAGGACCC GCTAGCACCC 1920 

GGGAAACCGC CGCGCGAGGA CCTGAAGTAA GAATTC 1956

(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 652 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

Met His Leu Ile Pro His Trp Ile Pro Leu Val Ala Ser Leu Gly Leu
1 5 10 15

Leu Ala Gly Gly Ser Ser Ala Ser Ala Ala Glu Glu Ala Phe Asp Leu
20 25 30

Trp Asn Glu Cys Ala Lys Ala Cys Val Leu Asp Leu Lys Asp Gly Val
35 40 45

Arg Ser Ser Arg Met Ser Val Asp Pro Ala Ile Ala Asp Thr Asn Gly
50 55 60

Gln Gly Val Leu His Tyr Ser Met Val Leu Glu Gly Gly Asn Asp Ala
65 70 75 80

Leu Lys Leu Ala Ile Asp Asn Ala Leu Ser Ile Thr Ser Asp Gly Leu
85 90 95

Thr Ile Arg Leu Glu Gly Gly Val Glu Pro Asn Lys Pro Val Arg Tyr
100 105 110

Ser Tyr Thr Arg Gln Ala Arg Gly Ser Trp Ser Leu Asn Trp Leu Val
115 120 125

Pro Ile Gly His Glu Lys Pro Ser Asn Ile Lys Val Phe Ile His Glu
130 135 140

Leu Asn Ala Gly Asn Gln Leu Ser His Met Ser Pro Ile Tyr Thr Ile
145 150 155 160

Glu Met Gly Asp Glu Leu Leu Ala Lys Leu Ala Arg Asp Ala Thr Phe
165 170 175

Phe Val Arg Ala His Glu Ser Asn Glu Met Gln Pro Thr Leu Ala Ile
180 185 190

Ser His Ala Gly Val Ser Val Val Met Ala Gln Thr Gln Pro Arg Arg
195 200 205

Glu Lys Arg Trp Ser Glu Trp Ala Ser Gly Lys Val Leu Cys Leu Leu
210 215 220

Asp Pro Leu Asp Gly Val Tyr Asn Tyr Leu Ala Gln Gln Arg Cys Asn
225 230 235 240

Leu Asp Asp Thr Trp Glu Gly Lys Ile Tyr Arg Val Leu Ala Gly Asn
245 250 255

Pro Ala Lys His Asp Leu Asp Ile Lys Pro Thr Val Ile Ser His Arg
260 265 270

Leu His Phe Pro Glu Gly Gly Ser Leu Ala Ala Leu Thr Ala His Gln
275 280 285

Ala Cys His Leu Pro Leu Glu Thr Phe Thr Arg His Arg Gln Pro Arg
290 295 300

Gly Trp Glu Gln Leu Glu Gln Cys Gly Tyr Pro Val Gln Arg Leu Val
305 310 315 320

Ala Leu Tyr Leu Ala Ala Arg Leu Ser Trp Asn Gln Val Asp Gln Val
325 330 335

Ile Arg Asn Ala Leu Ala Ser Pro Gly Ser Gly Gly Asp Leu Gly Glu
340 345 350

Ala Ile Arg Glu Gln Pro Glu Gln Ala Arg Leu Ala Leu Thr Leu Ala
355 360 365

Ala Ala Glu Ser Glu Arg Phe Val Arg Gln Gly Thr Gly Asn Asp Glu
370 375 380

Ala Gly Ala Ala Asn Ala Asp Val Val Ser Leu Thr Cys Pro Val Ala
385 390 395 400

Ala Gly Glu Cys Ala Gly Pro Ala Asp Ser Gly Asp Ala Leu Leu Glu
405 410 415

Arg Asn Tyr Pro Thr Gly Ala Glu Phe Leu Gly Asp Gly Gly Asp Val
420 425 430

Ser Phe Ser Thr Arg Gly Thr Gln Asn Trp Thr Val Glu Arg Leu Leu
435 440 445

Gln Ala His Arg Gln Leu Glu Glu Arg Gly Tyr Val Phe Val Gly Tyr
450 455 460

His Gly Thr Phe Leu Glu Ala Ala Gln Ser Ile Val Phe Gly Gly Val
465 470 475 480

Arg Ala Arg Ser Gln Asp Leu Asp Ala Ile Trp Arg Gly Phe Tyr Ile
485 490 495

Ala Gly Asp Pro Ala Leu Ala Tyr Gly Tyr Ala Gln Asp Gln Glu Pro
500 505 510

Asp Ala Arg Gly Arg Ile Arg Asn Gly Ala Leu Leu Arg Val Tyr Val
515 520 525

Pro Arg Ser Ser Leu Pro Gly Phe Tyr Arg Thr Ser Leu Thr Leu Ala
530 535 540

Ala Pro Glu Ala Ala Gly Glu Val Glu Arg Leu Ile Gly His Pro Leu
545 550 555 560

Pro Leu Arg Leu Asp Ala Ile Thr Gly Pro Glu Glu Glu Gly Gly Arg
565 570 575

Leu Glu Thr Ile Leu Gly Trp Pro Leu Ala Glu Arg Thr Val Val Ile
580 585 590

Pro Ser Ala Ile Pro Thr Asp Pro Arg Asn Val Gly Gly Asp Leu Asp
595 600 605

Pro Ser Ser Ile Pro Asp Lys Glu Gln Ala Ile Ser Ala Leu Pro Asp
610 615 620

Tyr Ala Ser Gln Pro Gly Lys Pro Pro Arg Glu Asp Pro Leu Ala Pro
625 630 635 640

Gly Lys Pro Pro Arg Glu Asp Leu Lys Xaa Glu Phe
645 650

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

CCGGGCTGAC TAAGGGGATT TTAGGATTTG TGTTCACGCT CACCGTGC 48
(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2004 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
ATGCACCTGA TACCCCATTG GATCCCCCTG GTCGCCAGCC TCGGCCTGCT CGCCGGCGGC 60

TCGTCCGCGT CCGCCGCCGA GGAAGCTTTC GACCTCTGGA ACGAATGCGC CAAAGCCTGC 120

GTGCTCGACC TCAAGGACGG CGTGCGTTCC AGCCGCATGA GCGTCGACCC GGCCATCGCC 180

GACACCAACG GCCAGGGCGT GCTGCACTAC TCCATGGTCC TGGAGGGCGG CAACGACGCG 240

CTCAAGCTGG CCATCGACAA CGCCCTCAGC ATCACCAGCG ACGGCCTGAC CATCCGCCTC 300

GAAGGCGGCG TCGAGCCGAA CAAGCCGGTG CGCTACAGCT ACACGCGCCA GGCGCGCGGC 360

AGTTGGTCGC TGAACTGGCT GGTACCGATC GGCCACGAGA AGCCCTCGAA CATCAAGGTG 420

TTCATCCACG AACTGAACGC CGGCAACCAG CTCAGCCACA TGTCGCCGAT CTACACCATC 480

GAGATGGGCG ACGAGTTGCT GGCGAAGCTG GCGCGCGATG CCACCTTCTT CGTCAGGGCG 540

CACGAGAGCA ACGAGATGCA GCCGACGCTC GCCATCAGCC ATGCCGGGGT CAGCGTGGTC 600

ATGGCCCAGA CCCAGCCGCG CCGGGAAAAG CGCTGGAGCG AATGGGCCAG CGGCAAGGTG 660

TTGTGCCTGC TCGACCCGCT GGACGGGGTC TACAACTACC TCGCCCAGCA ACGCTGCAAC 720

CTCGACGATA CCTGGGAAGG CAAGATCTAC CGGGTGCTCG CCGGCAACCC GGCGAAGCAT 780

GACCTGGACA TCAAACCCAC GGTCATCAGT CATCGCCTGC ACTTTCCCGA GGGCGGCAGC 840

CTGGCCGCGC TGACCGCGCA CCAGGCTTGC CACCTGCCGC TGGAGACTTT CACCCGTCAT 900

CGCCAGCCGC GCGGCTGGGA ACAACTGGAG CAGTGCGGCT ATCCGGTGCA GCGGCTGGTC 960

GCCCTCTACC TGGCGGCGCG GCTGTCGTGG AACCAGGTCG ACCAGGTGAT CCGCAACGCC 1020

CTGGCCAGCC CCGGCAGCGG CGGCGACCTG GGCGAAGCGA TCCGCGAGCA GCCGGAGCAG 1080

GCCCGTCTGG CCCTGACCCT GGCCGCCGCC GAGAGCGAGC GCTTCGTCCG GCAGGGCACC 1140

GGCAACGACG AGGCCGGCGC GGCCAACGCC GACGTGGTGA GCCTGACCTG CCCGGTCGCC 1200

GCCGGTGAAT GCGCGGGCCC GGCGGACAGC GGCGACGCCC TGCTGGAGCG CAACTATCCC 1260

ACTGGCGCGG AGTTCCTCGG CGACGGCGGC GACGTCAGCT TCAGCACCCG CGGCACGCAG 1320

AACTGGACGG TGGAGCGGCT GCTCCAGGCG CACCGCCAAC TGGAGGAGCG CGGCTATGTG 1380

TTCGTCGGCT ACCACGGCAC CTTCCTCGAA GCGGCGCAAA GCATCGTCTT CGGCGGGGTG 1440

CGCGCGCGCA GCCAGGACCT CGACGCGATC TGGCGCGGTT TCTATATCGC CGGCGATCCG 1500

GCGCTGGCCT ACGGCTACGC CCAGGACCAG GAACCCGACG CACGCGGCCG GATCCGCAAC 1560

GGTGCCCTGC TGCGGGTCTA TGTGCCGCGC TCGAGCCTGC CGGGCTTCTA CCGCACCAGC 1620

CTGACCCTGG CCGCGCCGGA GGCGGCGGGC GAGGTCGAAC GGCTGATCGG CCATCCGCTG 1680

CCGCTGCGCC TGGACGCCAT CACCGGCCCC GAGGAGGAAG GCGGGCGCCT GGAGACCATT 1740

CTCGGCTGGC CGCTGGCCGA GCGCACCGTG GTGATTCCCT CGGCGATCCC CACCGACCCG 1800

CGCAACGTCG GCGGCGACCT CGACCCGTCC AGCATCCCCG ACAAGGAACA GGCGATCAGC 1860

GCCCTGCCGG ACTACGCCAG CCAGCCCGGC AAACCGCCGC GCGAGGACCC GCTAGCACCC 1920

GGGCTGACTA AGGGGATTTT AGGATTTGTG TTCACGCTCA CCGTGCCCGG GAAACCGCCG 1980

CGCGAGGACC TGAAGTAAGA ATTC 2004
(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 668 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

Met His Leu Ile Pro His Trp Ile Pro Leu Val Ala Ser Leu Gly Leu
1 5 10 15

Leu Ala Gly Gly Ser Ser Ala Ser Ala Ala Glu Glu Ala Phe Asp Leu
20 25 30

Trp Asn Glu Cys Ala Lys Ala Cys Val Leu Asp Leu Lys Asp Gly Val
35 40 45

Arg Ser Ser Arg Met Ser Val Asp Pro Ala Ile Ala Asp Thr Asn Gly
50 55 60

Gln Gly Val Leu His Tyr Ser Met Val Leu Glu Gly Gly Asn Asp Ala
65 70 75 80

Leu Lys Leu Ala Ile Asp Asn Ala Leu Ser Ile Thr Ser Asp Gly Leu
85 90 95

Thr Ile Arg Leu Glu Gly Gly Val Glu Pro Asn Lys Pro Val Arg Tyr
100 105 110

Ser Tyr Thr Arg Gln Ala Arg Gly Ser Trp Ser Leu Asn Trp Leu Val
115 120 125

Pro Ile Gly His Glu Lys Pro Ser Asn Ile Lys Val Phe Ile His Glu
130 135 140

Leu Asn Ala Gly Asn Gln Leu Ser His Met Ser Pro Ile Tyr Thr Ile
145 150 155 160

Glu Met Gly Asp Glu Leu Leu Ala Lys Leu Ala Arg Asp Ala Thr Phe
165 170 175

Phe Val Arg Ala His Glu Ser Asn Glu Met Gln Pro Thr Leu Ala Ile
180 185 190

Ser His Ala Gly Val Ser Val Val Met Ala Gln Thr Gln Pro Arg Arg
195 200 205

Glu Lys Arg Trp Ser Glu Trp Ala Ser Gly Lys Val Leu Cys Leu Leu
210 215 220

Asp Pro Leu Asp Gly Val Tyr Asn Tyr Leu Ala Gln Gln Arg Cys Asn
225 230 235 240

Leu Asp Asp Thr Trp Glu Gly Lys Ile Tyr Arg Val Leu Ala Gly Asn
245 250 255

Pro Ala Lys His Asp Leu Asp Ile Lys Pro Thr Val Ile Ser His Arg
260 265 270

Leu His Phe Pro Glu Gly Gly Ser Leu Ala Ala Leu Thr Ala His Gln
275 280 285

Ala Cys His Leu Pro Leu Glu Thr Phe Thr Arg His Arg Gln Pro Arg
290 295 300

Gly Trp Glu Gln Leu Glu Gln Cys Gly Tyr Pro Val Gln Arg Leu Val
305 310 315 320

Ala Leu Tyr Leu Ala Ala Arg Leu Ser Trp Asn Gln Val Asp Gln Val
325 330 335

Ile Arg Asn Ala Leu Ala Ser Pro Gly Ser Gly Gly Asp Leu Gly Glu
340 345 350

Ala Ile Arg Glu Gln Pro Glu Gln Ala Arg Leu Ala Leu Thr Leu Ala
355 360 365

Ala Ala Glu Ser Glu Arg Phe Val Arg Gln Gly Thr Gly Asn Asp Glu
370 375 380

Ala Gly Ala Ala Asn Ala Asp Val Val Ser Leu Thr Cys Pro Val Ala
385 390 395 400

Ala Gly Glu Cys Ala Gly Pro Ala Asp Ser Gly Asp Ala Leu Leu Glu
405 410 415

Arg Asn Tyr Pro Thr Gly Ala Glu Phe Leu Gly Asp Gly Gly Asp Val
420 425 430

Ser Phe Ser Thr Arg Gly Thr Gln Asn Trp Thr Val Glu Arg Leu Leu
435 440 445

Gln Ala His Arg Gln Leu Glu Glu Arg Gly Tyr Val Phe Val Gly Tyr
450 455 460

His Gly Thr Phe Leu Glu Ala Ala Gln Ser Ile Val Phe Gly Gly Val
465 470 475 480

Arg Ala Arg Ser Gln Asp Leu Asp Ala Ile Trp Arg Gly Phe Tyr Ile
485 490 495

Ala Gly Asp Pro Ala Leu Ala Tyr Gly Tyr Ala Gln Asp Gln Glu Pro
500 505 510

Asp Ala Arg Gly Arg Ile Arg Asn Gly Ala Leu Leu Arg Val Tyr Val
515 520 525

Pro Arg Ser Ser Leu Pro Gly Phe Tyr Arg Thr Ser Leu Thr Leu Ala
530 535 540

Ala Pro Glu Ala Ala Gly Glu Val Glu Arg Leu Ile Gly His Pro Leu
545 550 555 560

Pro Leu Arg Leu Asp Ala Ile Thr Gly Pro Glu Glu Glu Gly Gly Arg
565 570 575

Leu Glu Thr Ile Leu Gly Trp Pro Leu Ala Glu Arg Thr Val Val Ile
580 585 590

Pro Ser Ala Ile Pro Thr Asp Pro Arg Asn Val Gly Gly Asp Leu Asp
595 600 605

Pro Ser Ser Ile Pro Asp Lys Glu Gln Ala Ile Ser Ala Leu Pro Asp
610 615 620

Tyr Ala Ser Gln Pro Gly Lys Pro Pro Arg Glu Asp Pro Leu Ala Pro
625 630 635 640

Gly Leu Thr Lys Gly Ile Leu Gly Phe Val Phe Thr Leu Thr Val Pro
645 650 655

Gly Lys Pro Pro Arg Glu Asp Leu Lys Xaa Glu Phe
660 665

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
GCACCCGGGA TCCCGTCAGG CCCCCTC

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

GCACCCGGGC TCCCTCTTGA GCTTCCT

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2238 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

ATGCACCTGA TACCCCATTG GATCCCCCTG GTCGCCAGCC TCGGCCTGCT CGCCGGCGGC 60 

TCGTCCGCGT CCGCCGCCGA GGAAGCTTTC GACCTCTGGA ACGAATGCGC CAAAGCCTGC 120 

GTGCTCGACC TCAAGGACGG CGTGCGTTCC AGCCGCATGA GCGTCGACCC GGCCATCGCC 180 

GACACCAACG GCCAGGGCGT GCTGCACTAC TCCATGGTCC TGGAGGGCGG CAACGACGCG 240 

CTCAAGCTGG CCATCGACAA CGCCCTCAGC ATCACCAGCG ACGGCCTGAC CATCCGCCTC 300 

GAAGGCGGCG TCGAGCCGAA CAAGCCGGTG CGCTACAGCT ACACGCGCCA GGCGCGCGGC 360 

AGTTGGTCGC TGAACTGGCT GGTACCGATC GGCCACGAGA AGCCCTCGAA CATCAAGGTG 420 

TTCATCCACG AACTGAACGC CGGCAACCAG CTCAGCCACA TGTCGCCGAT CTACACCATC 480 

GAGATGGGCG ACGAGTTGCT GGCGAAGCTG GCGCGCGATG CCACCTTCTT CGTCAGGGCG 540 

CACGAGAGCA ACGAGATGCA GCCGACGCTC GCCATCAGCC ATGCCGGGGT CAGCGTGGTC 600 

ATGGCCCAGA CCCAGCCGCG CCGGGAAAAG CGCTGGAGCG AATGGGCCAG CGGCAAGGTG 660 

TTGTGCCTGC TCGACCCGCT GGACGGGGTC TACAACTACC TCGCCCAGCA ACGCTGCAAC 720 

CTCGACGATA CCTGGGAAGG CAAGATCTAC CGGGTGCTCG CCGGCAACCC GGCGAAGCAT 780 

GACCTGGACA TCAAACCCAC GGTCATCAGT CATCGCCTGC ACTTTCCCGA GGGCGGCAGC 840 

CTGGCCGCGC TGACCGCGCA CCAGGCTTGC CACCTGCCGC TGGAGACTTT CACCCGTCAT 900 

CGCCAGCCGC GCGGCTGGGA ACAACTGGAG CAGTGCGGCT ATCCGGTGCA GCGGCTGGTC 960 

GCCCTCTACC TGGCGGCGCG GCTGTCGTGG AACCAGGTCG ACCAGGTGAT CCGCAACGCC 1020 

CTGGCCAGCC CCGGCAGCGG CGGCGACCTG GGCGAAGCGA TCCGCGAGCA GCCGGAGCAG 1080 

GCCCGTCTGG CCCTGACCCT GGCCGCCGCC GAGAGCGAGC GCTTCGTCCG GCAGGGCACC 1140 

GGCAACGACG AGGCCGGCGC GGCCAACGCC GACGTGGTGA GCCTGACCTG CCCGGTCGCC 1200 

GCCGGTGAAT GCGCGGGCCC GGCGGACAGC GGCGACGCCC TGCTGGAGCG CAACTATCCC 1260 



ACTGGCGCGG AGTTCCTCGG CGACGGCGGC GACGTCAGCT TCAGCACCCG CGGCACGCAG 1320 

AACTGGACGG TGGAGCGGCT GCTCCAGGCG CACCGCCAAC TGGAGGAGCG CGGCTATGTG 1380 

TTCGTCGGCT ACCACGGCAC CTTCCTCGAA GCGGCGCAAA GCATCGTCTT CGGCGGGGTG 1440 

CGCGCGCGCA GCCAGGACCT CGACGCGATC TGGCGCGGTT TCTATATCGC CGGCGATCCG 1500 

GCGCTGGCCT ACGGCTACGC CCAGGACCAG GAACCCGACG CACGCGGCCG GATCCGCAAC 1560 

GGTGCCCTGC TGCGGGTCTA TGTGCCGCGC TCGAGCCTGC CGGGCTTCTA CCGCACCAGC 1620 

CTGACCCTGG CCGCGCCGGA GGCGGCGGGC GAGGTCGAAC GGCTGATCGG CCATCCGCTG 1680 

CCGCTGCGCC TGGACGCCAT CACCGGCCCC GAGGAGGAAG GCGGGCGCCT GGAGACCATT 1740 

CTCGGCTGGC CGCTGGCCGA GCGCACCGTG GTGATTCCCT CGGCGATCCC CACCGACCCG 1800 

CGCAACGTCG GCGGCGACCT CGACCCGTCC AGCATCCCCG ACAAGGAACA GGCGATCAGC 1860 

GCCCTGCCGG ACTACGCCAG CCAGCCCGGC AAACCGCCGC GCGAGGACCC GCTAGCACCC 1920 

GGGATCCCGT CAGGCCCCCT CAAAGCCGAG ATCGCACAGA GACTTGAAGA TGTCTTTGCA 1980 

GGGAAGAACA CCGATCTTGA GGTTCTCATG GAATGGCTAA AGACAAGACC AATCCTGTCA 2040 

CCTCTGACTA AGGGGATTTT AGGATTTGTG TTCACGCTCA CCGTGCCCAG TGAGCGAGGA 2100 

CTGCAGCGTA GACGCTTTGT CCAAAATGCC CTTAATGGGA ACGGGGATCC AAATAACATG 2160 

GACAAAGCAG TTAAACTGTA TAGGAAGCTC AAGAGGGAGC CCGGGAAACC GCCGCGCGAG 2220 

GACCTGAAGT AAGAATTC 2238 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 746 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

Met His Leu Ile Pro His Trp Ile Pro Leu Val Ala Ser Leu Gly Leu 
1 5 10 15 

Leu Ala Gly Gly Ser Ser Ala Ser Ala Ala Glu Glu Ala Phe Asp Leu 
20 25 30 

Trp Asn Glu Cys Ala Lys Ala Cys Val Leu Asp Leu Lys Asp Gly Val 
35 40 45 

Arg Ser Ser Arg Met Ser Val Asp Pro Ala Ile Ala Asp Thr Asn Gly 
50 55 60 

Gln Gly Val Leu His Tyr Ser Met Val Leu Glu Gly Gly Asn Asp Ala 
65 70 75 80 

Leu Lys Leu Ala Ile Asp Asn Ala Leu Ser Ile Thr Ser Asp Gly Leu 
85 90 95 

Thr Ile Arg Leu Glu Gly Gly Val Glu Pro Asn Lys Pro Val Arg Tyr 
100 105 110 

Ser Tyr Thr Arg Gln Ala Arg Gly Ser Trp Ser Leu Asn Trp Leu Val 
115 120 125 

Pro Ile Gly His Glu Lys Pro Ser Asn Ile Lys Val Phe Ile His Glu 
130 135 140 

Leu Asn Ala Gly Asn Gln Leu Ser His Met Ser Pro Ile Tyr Thr Ile 
145 150 155 160 

Glu Met Gly Asp Glu Leu Leu Ala Lys Leu Ala Arg Asp Ala Thr Phe 
165 170 175 

Phe Val Arg Ala His Glu Ser Asn Glu Met Gln Pro Thr Leu Ala Ile 
180 185 190 

Ser His Ala Gly Val Ser Val Val Met Ala Gln Thr Gln Pro Arg Arg 
195 200 205 

Glu Lys Arg Trp Ser Glu Trp Ala Ser Gly Lys Val Leu Cys Leu Leu 
210 215 220 

Asp Pro Leu Asp Gly Val Tyr Asn Tyr Leu Ala Gln Gln Arg Cys Asn 
225 230 235 240 

Leu Asp Asp Thr Trp Glu Gly Lys Ile Tyr Arg Val Leu Ala Gly Asn 
245 250 255 

Pro Ala Lys His Asp Leu Asp Ile Lys Pro Thr Val Ile Ser His Arg 
260 265 270 

Leu His Phe Pro Glu Gly Gly Ser Leu Ala Ala Leu Thr Ala His Gln 
275 280 285 

Ala Cys His Leu Pro Leu Glu Thr Phe Thr Arg His Arg Gln Pro Arg 
290 295 300 

Gly Trp Glu Gln Leu Glu Gln Cys Gly Tyr Pro Val Gln Arg Leu Val 
305 310 315 320 

Ala Leu Tyr Leu Ala Ala Arg Leu Ser Trp Asn Gln Val Asp Gln Val 
325 330 335 

Ile Arg Asn Ala Leu Ala Ser Pro Gly Ser Gly Gly Asp Leu Gly Glu 
340 345 350 

Ala Ile Arg Glu Gln Pro Glu Gln Ala Arg Leu Ala Leu Thr Leu Ala 
355 360 365 

Ala Ala Glu Ser Glu Arg Phe Val Arg Gln Gly Thr Gly Asn Asp Glu 
370 375 380 

Ala Gly Ala Ala Asn Ala Asp Val Val Ser Leu Thr Cys Pro Val Ala 
385 390 395 400 

Ala Gly Glu Cys Ala Gly Pro Ala Asp Ser Gly Asp Ala Leu Leu Glu 
405 410 415 

Arg Asn Tyr Pro Thr Gly Ala Glu Phe Leu Gly Asp Gly Gly Asp Val 
420 425 430 

Ser Phe Ser Thr Arg Gly Thr Gln Asn Trp Thr Val Glu Arg Leu Leu 
435 440 445 

Gln Ala His Arg Gln Leu Glu Glu Arg Gly Tyr Val Phe Val Gly Tyr 
450 455 460 

His Gly Thr Phe Leu Glu Ala Ala Gln Ser Ile Val Phe Gly Gly Val 
465 470 475 480 

Arg Ala Arg Ser Gln Asp Leu Asp Ala Ile Trp Arg Gly Phe Tyr Ile 
485 490 495 

Ala Gly Asp Pro Ala Leu Ala Tyr Gly Tyr Ala Gln Asp Gln Glu Pro 
500 505 510 

Asp Ala Arg Gly Arg Ile Arg Asn Gly Ala Leu Leu Arg Val Tyr Val 
515 520 525 

Pro Arg Ser Ser Leu Pro Gly Phe Tyr Arg Thr Ser Leu Thr Leu Ala 
530 535 540 



Ala Pro Glu Ala Ala Gly Glu Val Glu Arg Leu Ile Gly His Pro Leu 
545 550 555 560 

Pro Leu Arg Leu Asp Ala Ile Thr Gly Pro Glu Glu Glu Gly Gly Arg 
565 570 575 

Leu Glu Thr Ile Leu Gly Trp Pro Leu Ala Glu Arg Thr Val Val Ile 
580 585 590 

Pro Ser Ala Ile Pro Thr Asp Pro Arg Asn Val Gly Gly Asp Leu Asp 
595 600 605 

Pro Ser Ser Ile Pro Asp Lys Glu Gln Ala Ile Ser Ala Leu Pro Asp 
610 615 620 

Tyr Ala Ser Gln Pro Gly Lys Pro Pro Arg Glu Asp Pro Leu Ala Pro 
625 630 635 640 

Gly Ile Pro Ser Gly Pro Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu 
645 650 655 

Asp Val Phe Ala Gly Lys Asn Thr Asp Leu Glu Val Leu Met Glu Trp 
660 665 670 

Leu Lys Thr Arg Pro Ile Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly 
675 680 685 

Phe Val Phe Thr Leu Thr Val Pro Ser Glu Arg Gly Leu Gln Arg Arg 
690 695 700 

Arg Phe Val Gln Asn Ala Leu Asn Gly Asn Gly Asp Pro Asn Asn Met 
705 710 715 720 

Asp Lys Ala Val Lys Leu Tyr Arg Lys Leu Lys Arg Glu Pro Gly Lys 
725 730 735 

Pro Pro Arg Glu Asp Leu Lys Xaa Glu Phe 
740 745 

(2) INFORMATION FOR SEQ ID NO:30: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 
CTAGACTAGT CTAG 14 

(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: 

GGCGGCAGAA AGAGC 15 

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala Ala Ala Asp 
1 5 10 15 

Ala Asp Thr Ile Cys 
20 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
GGCAGAAAGA TGAAGGCAAA CCTACTGGTC CTGTTATGTG CACTTGCAGC TGCAGATGCA 60 
GACACAATAT GC 72 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

Gly Arg Lys Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala 
1 5 10 15 

Ala Ala Asp Ala Asp Thr Ile Cys 
20 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 63 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 

ATGAAGGCAA ACCTACTGGT CCTGTTATGT GCACTTGCAG CTGCAGATGC AGACACAATA 60 

TGA 63 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala Ala Ala Asp 
1 5 10 15 

Ala Asp Thr Ile Xaa 

(2) INFORMATION FOR SEQ ID NO:37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 

His His Ala Asn Glu Asn Ile Phe Tyr Cys Pro Ile Ala Ile Met Ser 
1 5 10 15 

Ala Leu Ala Met Val Tyr Leu Gly Ala Lys Asp 
20 25 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: 
CACCATGCCA ATGAGAACAT CTTCTACTGC CCCATTGCCA TCATGTCAGC TCTAGCCATG 60 
GTATACCTGG GTGCAAAAAG C 81 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: 

His His Ala Asn Glu Asn Ile Phe Tyr Cys Pro Ile Ala Ile Met Ser 
1 5 10 15 

Ala Leu Ala Met Val Tyr Leu Gly Ala Lys Ser 
20 25 

(2) INFORMATION FOR SEQ ID NO:40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 




EP 0 532 090 A2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 
GGCAGAAAGA TGAAGGCAAA CCTACTGGTC CTGTTATGTG CACTTGCAGC TGCAGATGCA 60 
GACACAATAT GCATGATG 78 

(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

Gly Arg Lys Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala 
1 5 10 15 

Ala Ala Asp Ala Asp Thr Ile Cys Met Met 
20 25 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 
GGCATGAAGG CAAACCTACT GGTCCTGTTA TGTGCACTTG CAGCTGCAGA TGCAGACACA 60 
ATATGCATGA TG 72 

(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 

Gly Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala Ala Ala 
1 5 10 15 

Asp Ala Asp Thr Ile Cys Met Met 
20 

(2) INFORMATION FOR SEQ ID NO:44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 90 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
GTATGCATGC ACCATGCCAA TGAGAACATC TTCTACTGCC CCATTGCCAT CATGTCAGCT 60 
CTAGCCATGG TATACCTGGG TGCAAAAGAC 90 

(2) INFORMATION FOR SEQ ID NO:45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: 

Val Cys Met His His Ala Asn Glu Asn Ile Phe Tyr Cys Pro Ile Ala 
1 5 10 15 

Ile Met Ser Ala Leu Ala Met Val Tyr Leu Gly Ala Lys Asp 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 147 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 
ATGAAGGCAA ACCTACTGGT CCTGTTATGT GCACTTGCAG CTGCAGATGC AGACACAATA 60 
TGCCACCATG CCAATGAGAA CATCTTCTAC TGCCCCATTG CCATCATGTC AGCTCTAGCC 120 
ATGGTATACC TGGGTGCAAA AGACAGC 147 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala Ala Ala Asp 
1 5 10 15 

Ala Asp Thr Ile Cys His His Ala Asn Glu Asn Ile Phe Tyr Cys Pro 
20 25 30 

Ile Ala Ile Met Ser Ala Leu Ala Met Val Tyr Leu Gly Ala Lys Asp 
35 40 45 

Ser 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 70 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 
CCTATCAGAA ACGAATGGGG GTGCAGATGC AACGGTTCAA GCGCGAGGAC CTGAAGTAAG 60 
AATTCGAGCT 70 

(2) INFORMATION FOR SEQ ID NO:49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2013 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: 

ATGGCCGAGG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 60 

AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 120 

CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 180 

ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 240 

GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 300 

AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 360 

CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 420 

GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 480 

GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 540 

CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 600 

GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 660 

TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 720 

AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 780 

ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 840 

GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 900 

GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 960 

GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 1020 

CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 1080 

GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 1140 

GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 1200 

TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCAGTCTTCT AACCGAGGTC 1260 

GAAACGTACG TTCTCTCTAT CATCCCGTCA GGCCCCCTCA AAGCCGAGAT CGCACAGAGA 1320 

CTTGAAGATG TCTTTGCAGG GAAGAACACC GATCTTGAGG TTCTCATGGA ATGGCTAAAG 1380 
ACAAGACCAA TCCTGTCACC TCTGACTAAG GGGATTTTAG GATTTGTGTT CACGCTCACC 1440 
GTGCCCAGTG AGCGAGGACT GCAGCGTAGA CGCTTTGTCC AAAATGCCCT TAATGGGAAC 1500 
GGGGATCCAA ATAACATGGA CAAAGCAGTT AAACTGTATA GGAAGCTCAA GAGGGAGATA 1560 
ACATTCCATG GGGCCAAAGA AATCTCACTC AGTTATTCTG CTGGTGCACT TGCCAGTTGT 1620 
ATGGGCCTCA TATACAACAG GATGGGGGCT GTGACCACTG AAGTGGCATT TGGCCTGGTA 1680 
TGTGCAACCT GTGAACAGAT TGCTGACTCC CAGCATCGGT CTCATAGGCA AATGGTGACA 1740 
ACAACCAACC CACTAATCAG ACATGAGAAC AGAATGGTTT TAGCCAGCAC TACAGCTAAG 1800 
GCTATGGAGC AAATGGCTGG ATCGAGTGAG CAAGCAGCAG AGGCCATGGA GGTTGCTAGT 1860 
CAGGCTAGGC AAATGGTGCA AGCGATGAGA ACCATTGGGA CTCATCCTAG CTCCAGTGCT 1920 
GGTCTGAAAA ATGATCTTCT TGAAAATTTG CAGGCCTATC AGAAACGAAT GGGGGTGCAG 1980 

ATGCAACGGT TCAAGCGCGA GGACCTGAAG TAA 2013 

(2) INFORMATION FOR SEQ ID NO:50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 671 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50: 

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys 
1 5 10 15 

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp 
20 25 30 

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met 
35 40 45 

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala 
50 55 60 

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val 
65 70 75 80 

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly 
85 90 95 

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser 
100 105 110 

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser 
115 120 125 

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala 
130 135 140 

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn 
145 150 155 160 

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val 
165 170 175 

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala 
180 185 190 

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn 
195 200 205 

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys 
210 215 220 



Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile 
225 230 235 240 

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser 
245 250 255 

Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr 
260 265 270 

Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys 
275 280 285 

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu 
290 295 300 

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro 
305 310 315 320 

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln 
325 330 335 

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val 
340 345 350 

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val 
355 360 365 

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala 
370 375 380 

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu 
385 390 395 400 

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Ser Leu 
405 410 415 

Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Ile Pro Ser Gly Pro 
420 425 430 

Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu Asp Val Phe Ala Gly Lys 
435 440 445 

Asn Thr Asp Leu Glu Val Leu Met Glu Trp Leu Lys Thr Arg Pro Ile 
450 455 460 

Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe Thr Leu Thr 
465 470 475 480 

Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg Phe Val Gln Asn Ala 
485 490 495 

Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Lys Ala Val Lys Leu 
500 505 510 

Tyr Arg Lys Leu Lys Arg Glu Ile Thr Phe His Gly Ala Lys Glu Ile 
515 520 525 

Ser Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser Cys Met Gly Leu Ile 
530 535 540 

Tyr Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe Gly Leu Val 
545 550 555 560 

Cys Ala Thr Cys Glu Gln Ile Ala Asp Ser Gln His Arg Ser His Arg 
565 570 575 

Gln Met Val Thr Thr Thr Asn Pro Leu Ile Arg His Glu Asn Arg Met 
580 585 590 

Val Leu Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met Ala Gly Ser 
595 600 605 

Ser Glu Gln Ala Ala Glu Ala Met Glu Val Ala Ser Gln Ala Arg Gln 
610 615 620 

Met Val Gln Ala Met Arg Thr Ile Gly Thr His Pro Ser Ser Ser Ala 
625 630 635 640 

Gly Leu Lys Asn Asp Leu Leu Glu Asn Leu Gln Ala Tyr Gln Lys Arg 
645 650 655 

Met Gly Val Gln Met Gln Arg Phe Lys Arg Glu Asp Leu Lys Xaa 
660 665 670 

(2) INFORMATION FOR SEQ ID NO:51: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 38 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: 

ATACCCGCGG CATGGCGTCC CAAGGCACCA AACGGTCT 38 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: 

ATAGAATTCT TACTTCAGGT CCTCGCGATT GTCGTACTCC TCTGCATTGT CTCCGAAGAA 60 

ATAAGATCCT TCATTACTCA T 81 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2754 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53: 

ATGGCCGAGG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 60 

AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 120 

CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 180 

ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 240 

GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 300 

AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 360 

CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 420 

GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 480 

GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 540 

CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 600 

GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 660 

TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 720 

AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 780 

ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 840 

GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 900 

GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 960 

GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 1020 

CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 1080 

GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 1140 

GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 1200 

TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCATGGCGTC CCAAGGCACC 1260 

AAACGGTCTT ACGAACAGAT GGAGACTGAT GGAGAACGCC AGAATGCCAC TGAAATCAGA 1320 

GCATCCGTCG GAAAAATGAT TGGTGGAATT GGACGATTCT ACATCCAAAT GTGCACAGAA 1380 

CTTAAACTCA GTGATTATGA GGGACGGTTG ATCCAAAACA GCTTAACAAT AGAGAGAATG 1440 

GTGCTCTCTG CTTTTGACGA AAGGAGAAAT AAATACCTGG AAGAACATCC CAGTGCGGGG 1500 

AAGGATCCTA AGAAAACTGG AGGACCTATA TACAGAAGAG TAAACGGAAA GTGGATGAGA 1560 

GAACTCATCC TTTATGACAA AGAAGAAATA AGGCGAATCT GGCGCCAAGC TAATAATGGT 1620 

GACGATGCAA CGGCTGGTCT GACTCACATG ATGATCTGGC ATTCCAATTT GAATGATGCA 1680 

ACTTATCAGA GGACAAGGGC TCTTGTTCGC ACCGGAATGG ATCCCAGGAT GTGCTCTCTG 1740 

ATGCAAGGTT CAACTCTCCC TAGGAGGTCT GGAGCCGCAG GTGCTGCAGT CAAAGGAGTT 1800 
GGAACAATGG TGATGGAATT GGTCAGGATG ATCAAACGTG GGATCAATGA TCGGAACTTC 1860 

TGGAGGGGTG AGAATGGACG AAAAACAAGA ATTGCTTATG AAAGAATGTG CAACATTCTC 1920 

AAAGGGAAAT TTCAAACTGC TGCACAAAAA GCAATGATGG ATCAAGTGAG AGAGAGCCGG 1980 

GACCCAGGGA ATGCTGAGTT CGAAGATCTC ACTTTTCTAG CACGGTCTGC ACTCATATTG 2040 

AGAGGGTCGG TTGCTCACAA GTCCTGCCTG CCTGCCTGTG TGTATGGACC TGCCGTAGCC 2100 

AGTGGGTACG ACTTTGAAAG AGAGGGATAC TCTCTAGTCG GAATAGACCC TTTCAGACTG 2160 

CTTCAAAACA GCCAAGTGTA CAGCCTAATC AGACCAAATG AGAATCCAGC ACACAAGAGT 2220 

CAACTGGTGT GGATGGCATG CCATTCTGCC GCATTTGAAG ATCTAAGAGT ATTGAGCTTC 2280 

ATCAAAGGGA CGAAGGTGGT CCCAAGAGGG AAGCTTTCCA CTAGAGGAGT TCAAATTGCT 2340 

TCCAATGAAA ATATGGAGAC TATGGAATCA AGTACACTTG AACTGAGAAG CAGGTACTGG 2400 

GCCATAAGGA CCAGAAGTGG AGGAAACACC AATCAACAGA GGGCATCTGC GGGCCAAATC 2460 

AGCATACAAC CTACGTTCTC AGTACAGAGA AATCTCCCTT TTGACAGAAC AACCGTTATG 2520 

GCAGCATTCA CTGGGAATAC AGAGGGGAGA ACATCTGACA TGAGGACCGA AATCATAAGG 2580 

ATGATGGAAA GTGCAAGACC AGAAGATGTG TCTTTCCAGG GGCGGGGAGT CTTCGAGCTC 2640 

TCGGACGAAA AGGCAGCGAG CCCGATCGTG CCTTCCTTTG ACATGAGTAA TGAAGGATCT 2700 

TATTTCTTCG GAGACAATGC AGAGGAGTAC GACAATCGCG AGGACCTGAA GTAA 2754 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 918 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54: 

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys 
1 5 10 15 

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp 
20 25 30 

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met 
35 40 45 

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala 
50 55 60 

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val 
65 70 75 80 

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly 
85 90 95 

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser 
100 105 110 

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser 
115 120 125 

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala 
130 135 140 

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn 
145 150 155 160 

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val 
165 170 175 

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala 
180 185 190 

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn 
195 200 205 

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys 
210 215 220 

Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile 
225 230 235 240 

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser 
245 250 255 

Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr 
260 265 270 



Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys 
275 280 285 

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu 
290 295 300 

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro 
305 310 315 320 

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln 
325 330 335 

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val 
340 345 350 

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val 
355 360 365 

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala 
370 375 380 

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu 
385 390 395 400 

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Met Ala 
405 410 415 

Ser Gln Gly Thr Lys Arg Ser Tyr Glu Gln Met Glu Thr Asp Gly Glu 
420 425 430 

Arg Gln Asn Ala Thr Glu Ile Arg Ala Ser Val Gly Lys Met Ile Gly 
435 440 445 

Gly Ile Gly Arg Phe Tyr Ile Gln Met Cys Thr Glu Leu Lys Leu Ser 
450 455 460 

Asp Tyr Glu Gly Arg Leu Ile Gln Asn Ser Leu Thr Ile Glu Arg Met 
465 470 475 480 

Val Leu Ser Ala Phe Asp Glu Arg Arg Asn Lys Tyr Leu Glu Glu His 
485 490 495 

Pro Ser Ala Gly Lys Asp Pro Lys Lys Thr Gly Gly Pro Ile Tyr Arg 
500 505 510 

Arg Val Asn Gly Lys Trp Met Arg Glu Leu Ile Leu Tyr Asp Lys Glu 
515 520 525 

Glu Ile Arg Arg Ile Trp Arg Gln Ala Asn Asn Gly Asp Asp Ala Thr 
530 535 540 

Ala Gly Leu Thr His Met Met Ile Trp His Ser Asn Leu Asn Asp Ala 
545 550 555 560 

Thr Tyr Gln Arg Thr Arg Ala Leu Val Arg Thr Gly Met Asp Pro Arg 
565 570 575 

Met Cys Ser Leu Met Gln Gly Ser Thr Leu Pro Arg Arg Ser Gly Ala 
580 585 590 

Ala Gly Ala Ala Val Lys Gly Val Gly Thr Met Val Met Glu Leu Val 
595 600 605 

Arg Met Ile Lys Arg Gly Ile Asn Asp Arg Asn Phe Trp Arg Gly Glu 
610 615 620 

Asn Gly Arg Lys Thr Arg Ile Ala Tyr Glu Arg Met Cys Asn Ile Leu 
625 630 635 640 

Lys Gly Lys Phe Gln Thr Ala Ala Gln Lys Ala Met Met Asp Gln Val 
645 650 655 

Arg Glu Ser Arg Asp Pro Gly Asn Ala Glu Phe Glu Asp Leu Thr Phe 
660 665 670 

Leu Ala Arg Ser Ala Leu Ile Leu Arg Gly Ser Val Ala His Lys Ser 
675 680 685 

Cys Leu Pro Ala Cys Val Tyr Gly Pro Ala Val Ala Ser Gly Tyr Asp 
690 695 700 



Phe Glu Arg Glu Gly Tyr Ser Leu Val Gly Ile Asp Pro Phe Arg Leu 
705 710 715 720 

Leu Gln Asn Ser Gln Val Tyr Ser Leu Ile Arg Pro Asn Glu Asn Pro 
725 730 735 

Ala His Lys Ser Gln Leu Val Trp Met Ala Cys His Ser Ala Ala Phe 
740 745 750 

Glu Asp Leu Arg Val Leu Ser Phe Ile Lys Gly Thr Lys Val Val Pro 
755 760 765 

Arg Gly Lys Leu Ser Thr Arg Gly Val Gln Ile Ala Ser Asn Glu Asn 
770 775 780 

Met Glu Thr Met Glu Ser Ser Thr Leu Glu Leu Arg Ser Arg Tyr Trp 
785 790 795 800 

Ala Ile Arg Thr Arg Ser Gly Gly Asn Thr Asn Gln Gln Arg Ala Ser 
805 810 815 

Ala Gly Gln Ile Ser Ile Gln Pro Thr Phe Ser Val Gln Arg Asn Leu 
820 825 830 

Pro Phe Asp Arg Thr Thr Val Met Ala Ala Phe Thr Gly Asn Thr Glu 
835 840 845 

Gly Arg Thr Ser Asp Met Arg Thr Glu Ile Ile Arg Met Met Glu Ser 
850 855 860 

Ala Arg Pro Glu Asp Val Ser Phe Gln Gly Arg Gly Val Phe Glu Leu 
865 870 875 880 

Ser Asp Glu Lys Ala Ala Ser Pro Ile Val Pro Ser Phe Asp Met Ser 
885 890 895 

Asn Glu Gly Ser Tyr Phe Phe Gly Asp Asn Ala Glu Glu Tyr Asp Asn 
900 905 910 

Arg Glu Asp Leu Lys Xaa 
915 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: 
ATACCCGCGG CATGGGTGCG AGAGCGTCGG TATAT 35 

(2) INFORMATION FOR SEQ ID NO:56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: 
ATAGAATTCT CATTGTGACG AGGGGTCGCT GCCAAA 36 

(2) INFORMATION FOR SEQ ID NO:57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2814 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57: 

ATGAAAAAGA CAGCTATCGC GATTGCAGTG GCACTGGCTG GTTTCGCTAC CGTAGCGCAG 60 

GCCGCGAATT TGGCCGAAGA AGCTTTCGAC CTCTGGAACG AATGCGCCAA AGCCTGCGTG 120 

CTCGACCTCA AGGACGGCGT GCGTTCCAGC CGCATGAGCG TCGACCCGGC CATCGCCGAC 180 

ACCAACGGCC AGGGCGTGCT GCACTACTCC ATGGTCCTGG AGGGCGGCAA CGACGCGCTC 240 

AAGCTGGCCA TCGACAACGC CCTCAGCATC ACCAGCGACG GCCTGACCAT CCGCCTCGAA 300 

GGCGGCGTCG AGCCGAACAA GCCGGTGCGC TACAGCTACA CGCGCCAGGC GCGCGGCAGT 360 

TGGTCGCTGA ACTGGCTGGT ACCGATCGGC CACGAGAAGC CCTCGAACAT CAAGGTGTTC 420 

ATCCACGAAC TGAACGCCGG CAACCAGCTC AGCCACATGT CGCCGATCTA CACCATCGAG 480 

ATGGGCGACG AGTTGCTGGC GAAGCTGGCG CGCGATGCCA CCTTCTTCGT CAGGGCGCAC 540 

GAGAGCAACG AGATGCAGCC GACGCTCGCC ATCAGCCATG CCGGGGTCAG CGTGGTCATG 600 

GCCCAGACCC AGCCGCGCCG GGAAAAGCGC TGGAGCGAAT GGGCCAGCGG CAAGGTGTTG 660 

TGCCTGCTCG ACCCGCTGGA CGGGGTCTAC AACTACCTCG CCCAGCAACG CTGCAACCTC 720 

GACGATACCT GGGAAGGCAA GATCTACCGG GTGCTCGCCG GCAACCCGGC GAAGCATGAC 780 

CTGGACATCA AACCCACGGT CATCAGTCAT CGCCTGCACT TTCCCGAGGG CGGCAGCCTG 840 

GCCGCGCTGA CCGCGCACCA GGCTTGCCAC CTGCCGCTGG AGACTTTCAC CCGTCATCGC 900 

CAGCCGCGCG GCTGGGAACA ACTGGAGCAG TGCGGCTATC CGGTGCAGCG GCTGGTCGCC 960 

CTCTACCTGG CGGCGCGGCT GTCGTGGAAC CAGGTCGACC AGGTGATCCG CAACGCCCTG 1020 

GCCAGCCCCG GCAGCGGCGG CGACCTGGGC GAAGCGATCC GCGAGCAGCC GGAGCAGGCC 1080 

CGTCTGGCCC TGACCCTGGC CGCCGCCGAG AGCGAGCGCT TCGTCCGGCA GGGCACCGGC 1140 

AACGACGAGG CCGGCGCGGC CAACGCCGAC GTGGTGAGCC TGACCTGCCC GGTCGCCGCC 1200 

GGTGAATGCG CGGGCCCGGC GGACAGCGGC GACGCCCTGC TGGAGCGCAA CTATCCCACT 1260 

GGCGCGGAGT TCCTCGGCGA CGGCGGCGAC GTCAGCTTCA GCACCCGCGG CATGGGTGCG 1320 



AGAGCGTCGG TATTAAGCGG GGGAGAATTA GATAAATGGG AAAAAATTCG GTTAAGGCCA 1380 

GGGGGAAAGA AACAATATAA ACTAAAACAT ATAGTATGGG CAAGCAGGGA GCTAGAACGA 1440 

TTCGCAGTTA ATCCTGGCCT TTTAGAGACA TCAGAAGGCT GTAGACAAAT ACTGGGACAG 1500 

CTACAACCAT CCCTTCAGAC AGGATCAGAA GAACTTAGAT CATTATATAA TACAATAGCA 1560 

GTCCTCTATT GTGTGCATCA AAGGATAGAT GTAAAAGACA CCAAGGAAGC CTTAGATAAG 1620 



ATAGAGGAAG AGCAAAACAA AAGTAAGAAA AAGGCACAGC AAGCAGCAGC TGACACAGGA 1680 

AACAACAGCC AGGTCAGCCA AAATTACCCT ATAGTGCAGA ACCTCCAGGG GCAAATGGTA 1740 

CATCAGGCCA TATCACCTAG AACTTTAAAT GCATGGGTAA AAGTAGTAGA AGAGAAGGCT 1800 

TTCAGCCCAG AAGTAATACC CATGTTTTCA GCATTATCAG AAGGAGCCAC CCCACAAGAT 1860 

TTAAATACCA TGCTAAACAC AGTGGGGGGA CATCAAGCAG CCATGCAAAT GTTAAAAGAG 1920 



ACCATCAATG AGGAAGCTGC AGAATGGGAT AGATTGCATC CAGTGCATGC AGGGCCTATT 1980 

GCACCAGGCC AGATGAGAGA ACCAAGGGGA AGTGACATAG CAGGAACTAC TAGTACCCTT 2040 

CAGGAACAAA TAGGATGGAT GACACATAAT CCACCTATCC CAGTAGGAGA AATCTATAAA 2100 

AGATGGATAA TCCTGGGATT AAATAAAATA GTAAGAATGT ATAGCCCTAC CAGCATTCTG 2160 

GACATAAGAC AAGGACCAAA GGAACCCTTT AGAGACTATG TAGACCGATT CTATAAAACT 2220 



CTAAGAGCCG AGCAAGCTTC ACAAGAGGTA AAAAATTGGA TGACAGAAAC CTTGTTGGTC 2280 

CAAAATGCGA ACCCAGATTG TAAGACTATT TTAAAAGCAT TGGGACCAGG AGCGACACTA 2340 

GAAGAAATGA TGACAGCATG TCAGGGAGTG GGGGGACCCG GCCATAAAGC AAGAGTTTTG 2400 

GCTGAAGCAA TGAGCCAAGT AACAAATCCA GCTACCATAA TGATACAGAA AGGCAATTTT 2460 

AGGAACCAAA GAAAGACTGT TAAGTGTTTC AATTGTGGCA AAGAAGGGCA CATAGCCAAA 2520 

AATTGCAGGG CCCCTAGGAA AAAGGGCTGT TGGAAATGTG GAAAGGAAGG ACACCAAATG 2580 

AAAGATTGTA CTGAGAGACA GGCTAATTTT TTAGGGAAGA TCTGGCCTTC CCACAAGGGA 2640 

AGGCCAGGGA ATTTTCTTCA GAGCAGACCA GAGCCAACAG CCCCACCAGA AGAGAGCTTC 2700 

AGGTTTGGGG AAGAGACAAC AACTCCCTCT CAGAAGCAGG AGCCGATAGA CAAGGAACTG 2760 

TATCCTTTAG CTTCCCTCAG ATCACTCTTT GGCAGCGACC CCTCGTCACA ATGA 2814 
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 938 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: 

Met Lys Lys Thr Ala Ile Ala Ile Ala Val Ala Leu Ala Gly Phe Ala 
1 5 10 15 

Thr Val Ala Gln Ala Ala Asn Leu Ala Glu Glu Ala Phe Asp Leu Trp 
20 25 30 

Asn Glu Cys Ala Lys Ala Cys Val Leu Asp Leu Lys Asp Gly Val Arg 
35 40 45 

Ser Ser Arg Met Ser Val Asp Pro Ala Ile Ala Asp Thr Asn Gly Gln 
50 55 60 

Gly Val Leu His Tyr Ser Met Val Leu Glu Gly Gly Asn Asp Ala Leu 
65 70 75 80 

Lys Leu Ala Ile Asp Asn Ala Leu Ser Ile Thr Ser Asp Gly Leu Thr 
85 90 95 

Ile Arg Leu Glu Gly Gly Val Glu Pro Asn Lys Pro Val Arg Tyr Ser 
100 105 110 

Tyr Thr Arg Gln Ala Arg Gly Ser Trp Ser Leu Asn Trp Leu Val Pro 
115 120 125 

Ile Gly His Glu Lys Pro Ser Asn Ile Lys Val Phe Ile His Glu Leu 
130 135 140 

Asn Ala Gly Asn Gln Leu Ser His Met Ser Pro Ile Tyr Thr Ile Glu 
145 150 155 160 

Met Gly Asp Glu Leu Leu Ala Lys Leu Ala Arg Asp Ala Thr Phe Phe 
165 170 175 

Val Arg Ala His Glu Ser Asn Glu Met Gln Pro Thr Leu Ala Ile Ser 
180 185 190 

His Ala Gly Val Ser Val Val Met Ala Gln Thr Gln Pro Arg Arg Glu 
195 200 205 

Lys Arg Trp Ser Glu Trp Ala Ser Gly Lys Val Leu Cys Leu Leu Asp 
210 215 220 

Pro Leu Asp Gly Val Tyr Asn Tyr Leu Ala Gln Gln Arg Cys Asn Leu 
225 230 235 240 

Asp Asp Thr Trp Glu Gly Lys Ile Tyr Arg Val Leu Ala Gly Asn Pro 
245 250 255 

Ala Lys His Asp Leu Asp Ile Lys Pro Thr Val Ile Ser His Arg Leu 
260 265 270 

His Phe Pro Glu Gly Gly Ser Leu Ala Ala Leu Thr Ala His Gln Ala 
275 280 285 

Cys His Leu Pro Leu Glu Thr Phe Thr Arg His Arg Gln Pro Arg Gly 
290 295 300 

Trp Glu Gln Leu Glu Gln Cys Gly Tyr Pro Val Gln Arg Leu Val Ala 
305 310 315 320 

Leu Tyr Leu Ala Ala Arg Leu Ser Trp Asn Gln Val Asp Gln Val Ile 
325 330 335 

Arg Asn Ala Leu Ala Ser Pro Gly Ser Gly Gly Asp Leu Gly Glu Ala 
340 345 350 

Ile Arg Glu Gln Pro Glu Gln Ala Arg Leu Ala Leu Thr Leu Ala Ala 
355 360 365 



Ala Glu Ser Glu Arg Phe Val Arg Gln Gly Thr Gly Asn Asp Glu Ala 
370 375 380 

Gly Ala Ala Asn Ala Asp Val Val Ser Leu Thr Cys Pro Val Ala Ala 
385 390 395 400 

Gly Glu Cys Ala Gly Pro Ala Asp Ser Gly Asp Ala Leu Leu Glu Arg 
405 410 415 

Asn Tyr Pro Thr Gly Ala Glu Phe Leu Gly Asp Gly Gly Asp Val Ser 
420 425 430 

Phe Ser Thr Arg Gly Met Gly Ala Arg Ala Ser Val Leu Ser Gly Gly 
435 440 445 

Glu Leu Asp Lys Trp Glu Lys Ile Arg Leu Arg Pro Gly Gly Lys Lys 
450 455 460 

Gln Tyr Lys Leu Lys His Ile Val Trp Ala Ser Arg Glu Leu Glu Arg 
465 470 475 480 

Phe Ala Val Asn Pro Gly Leu Leu Glu Thr Ser Glu Gly Cys Arg Gln 
485 490 495 

Ile Leu Gly Gln Leu Gln Pro Ser Leu Gln Thr Gly Ser Glu Glu Leu 
500 505 510 

Arg Ser Leu Tyr Asn Thr Ile Ala Val Leu Tyr Cys Val His Gln Arg 
515 520 525 

Ile Asp Val Lys Asp Thr Lys Glu Ala Leu Asp Lys Ile Glu Glu Glu 
530 535 540 

Gln Asn Lys Ser Lys Lys Lys Ala Gln Gln Ala Ala Ala Asp Thr Gly 
545 550 555 560 

Asn Asn Ser Gln Val Ser Gln Asn Tyr Pro Ile Val Gln Asn Leu Gln 
565 570 575 

Gly Gln Met Val His Gln Ala Ile Ser Pro Arg Thr Leu Asn Ala Trp 
580 585 590 

Val Lys Val Val Glu Glu Lys Ala Phe Ser Pro Glu Val Ile Pro Met 
595 600 605 

Phe Ser Ala Leu Ser Glu Gly Ala Thr Pro Gln Asp Leu Asn Thr Met 
610 615 620 

Leu Asn Thr Val Gly Gly His Gln Ala Ala Met Gln Met Leu Lys Glu 
625 630 635 640 

Thr Ile Asn Glu Glu Ala Ala Glu Trp Asp Arg Leu His Pro Val His 
645 650 655 

Ala Gly Pro Ile Ala Pro Gly Gln Met Arg Glu Pro Arg Gly Ser Asp 
660 665 670 

Ile Ala Gly Thr Thr Ser Thr Leu Gln Glu Gln Ile Gly Trp Met Thr 
675 680 685 

His Asn Pro Pro Ile Pro Val Gly Glu Ile Tyr Lys Arg Trp Ile Ile 
690 695 700 

Leu Gly Leu Asn Lys Ile Val Arg Met Tyr Ser Pro Thr Ser Ile Leu 
705 710 715 720 

Asp Ile Arg Gln Gly Pro Lys Glu Pro Phe Arg Asp Tyr Val Asp Arg 
725 730 735 

Phe Tyr Lys Thr Leu Arg Ala Glu Gln Ala Ser Gln Glu Val Lys Asn 
740 745 750 

Trp Met Thr Glu Thr Leu Leu Val Gln Asn Ala Asn Pro Asp Cys Lys 
755 760 765 

Thr Ile Leu Lys Ala Leu Gly Pro Gly Ala Thr Leu Glu Glu Met Met 
770 775 780 

Thr Ala Cys Gln Gly Val Gly Gly Pro Gly His Lys Ala Arg Val Leu 
785 790 795 800 

Ala Glu Ala Met Ser Gln Val Thr Asn Pro Ala Thr Ile Met Ile Gln 
805 810 815 

Lys Gly Asn Phe Arg Asn Gln Arg Lys Thr Val Lys Cys Phe Asn Cys 
820 825 830 

Gly Lys Glu Gly His Ile Ala Lys Asn Cys Arg Ala Pro Arg Lys Lys 
835 840 845 

Gly Cys Trp Lys Cys Gly Lys Glu Gly His Gln Met Lys Asp Cys Thr 
850 855 860 

Glu Arg Gln Ala Asn Phe Leu Gly Lys Ile Trp Pro Ser His Lys Gly 
865 870 875 880 

Arg Pro Gly Asn Phe Leu Gln Ser Arg Pro Glu Pro Thr Ala Pro Pro 
885 890 895 

Glu Glu Ser Phe Arg Phe Gly Glu Glu Thr Thr Thr Pro Ser Gln Lys 
900 905 910 

Gln Glu Pro Ile Asp Lys Glu Leu Tyr Pro Leu Ala Ser Leu Arg Ser 
915 920 925 

Leu Phe Gly Ser Asp Pro Ser Ser Gln Xaa 
930 935 






Claims 

1. A hybrid protein comprising: 

(a) a modified bacterial toxin that has a translocating domain, and 

(b) a polypeptide or protein that is exogenous to an antigen-presenting cell, 
said hybrid capable of eliciting an immune response by cytotoxic T lymphocytes. 

2. A hybrid protein comprising: 

(a) a modified Pseudomonas exotoxin; and 

(b) a polypeptide or protein that is exogenous to an antigen-presenting cell; 
said hybrid capable of eliciting an immune response by cytotoxic T lymphocytes. 

3. A hybrid protein comprising: 

(a) a modified Pseudomonas exotoxin; and 

(b) a polypeptide or protein that is exogenous to an antigen-presenting cell; 

said hybrid capable of being at least partially presented on an antigen-presenting cell surface. 

4. A hybrid protein comprising: 

(a) a modified Pseudomonas exotoxin; and 

(b) a polypeptide or protein of viral, parasitic or tumor origin; 

said hybrid capable of being at least partially presented on an antigen-presenting cell surface. 

5. A hybrid protein comprising: 

(a) a modified Pseudomonas exotoxin; and 

(b) a polypeptide or protein of viral origin; 

said hybrid capable of being internalized by an antigen-presenting cell and further capable of being at 
least partially presented on the surface of said antigen-presenting cell. 

6. A hybrid protein comprising: 

(a) a modified Pseudomonas exotoxin; and 

(b) a polypeptide or protein of viral origin; 
said hybrid capable of being internalized by an antigen-presenting cell and further capable of being 
processed for at least partial presentation on the surface of said antigen-presenting cell sufficiently to 
elicit an immune response by cytotoxic T lymphocytes. 

7. The hybrid protein as claimed in claim 1, wherein said modified bacterial toxin further comprises a 
cellular recognition domain. 

8. The hybrid protein as claimed in claim 2, wherein said modified Pseudomonas exotoxin lacks a 
functioning ADP ribosylating domain. 

9. The hybrid protein as claimed in claim 2, wherein said modified Pseudomonas exotoxin comprises a 
cellular recognition domain and a translocating domain. 

10. The hybrid protein as claimed in claim 2, wherein said modified Pseudomonas exotoxin comprises 
structural domains Ia, II and Ib. 

11. The hybrid protein as claimed in claim 2, wherein said modified Pseudomonas exotoxin is arranged on 
the amino-terminal side of said hybrid and said polypeptide is arranged on the carboxyl-terminal side of 
said hybrid protein. 

12. The hybrid protein as claimed in claim 2, wherein said polypeptide or protein is a viral protein fragment. 

13. The hybrid protein as claimed in claim 12, wherein said viral protein fragment comprises the matrix 
protein of influenza A virus. 

14. The hybrid protein as claimed in claim 12, wherein said viral protein fragment comprises residues 57 to 
68 of the matrix protein of influenza A virus. 

15. The hybrid protein as claimed in claim 12, wherein said viral protein fragment is sufficiently specific to 
bind to HLA-A2. 

16. The hybrid protein as claimed in claim 12, wherein said viral protein fragment comprises the 
nucleoprotein of influenza A virus. 

17. The hybrid protein as claimed in claim 12, wherein said viral protein fragment comprises the gag 
protein of human immunodeficiency virus-1. 

18. The hybrid protein as claimed in claim 1, wherein said polypeptide or protein is an antigen for use as a 
vaccine. 

19. The hybrid protein as claimed in claim 18, wherein said antigen for use as a vaccine is a viral antigen. 

20. The hybrid protein as claimed in claim 19, wherein said viral antigen is a conserved viral protein. 

21. The hybrid as claimed in claim 11, additionally comprising the peptide sequence Arg Glu Asp Leu Lys 
arranged on the carboxyl-terminal end of said polypeptide. 

22. The hybrid protein as claimed in claim 21, and having the sequence described in Sequence ID No. 35 
or 38. 

23. The hybrid protein as claimed in claim 8, wherein said Pseudomonas exotoxin further comprises an 
antigen peptide sequence inserted into structural domain III of said Pseudomonas exotoxin whose 
structural domain III cannot function as an ADP ribosylation domain. 

24. The hybrid protein as claimed in claim 23, and having the sequence described in Sequence ID No. 19. 

25. The hybrid protein as claimed in claim 23, and having the sequence described in Sequence ID No. 22. 

26. A vaccine comprising a pharmaceutically acceptable carrier and an amount of the hybrid protein as 
claimed in claim 1 sufficient to elicit an immune response by cytotoxic T lymphocytes. 

27. The vaccine as claimed in claim 26, wherein said hybrid protein comprises a modified Pseudomonas 
exotoxin and the matrix protein of influenza A virus. 

28. The vaccine as claimed in claim 26, wherein said hybrid protein comprises a modified Pseudomonas 
exotoxin and residues 57 to 68 of the matrix protein of influenza A virus. 

29. The vaccine as claimed in claim 26, wherein said hybrid protein comprises a modified Pseudomonas 
exotoxin and the nucleoprotein of influenza A. 

30. The vaccine as claimed in claim 26, wherein said hybrid protein comprises a modified Pseudomonas 
exotoxin and the gag protein of human immunodeficiency virus-1. 

31. The vaccine as claimed in claim 26, sufficient to immunize a host against influenza, acquired 
immunodeficiency syndrome, human papilloma virus, cytomegalovirus, Epstein-Barr virus, Rota virus, 
or respiratory syncytial virus. 